Re: [HACKERS] Add CREATE support to event triggers

2014-12-07 Thread Michael Paquier
On Thu, Nov 27, 2014 at 10:16 AM, Bruce Momjian br...@momjian.us wrote:
 On Sat, Nov  8, 2014 at 05:56:00PM +0100, Andres Freund wrote:
 On 2014-11-08 11:52:43 -0500, Tom Lane wrote:
  Adding a similar
  level of burden to support a feature with a narrow use-case seems like
  a nonstarter from here.

 I don't understand this statement. In my experience the lack of a usable
 replication solution that allows temporary tables and major version
 differences is one of the most, if not *the* most, frequent criticisms
 of postgres I hear. How is this a narrow use case?

 How would replicating DDL handle cases where the master and slave
 servers have different major versions and the DDL is only supported by
 the Postgres version on the master server?  If it would fail, does this
 limit the idea that logical replication allows major version-different
 replication?
Marking this patch as Returned with Feedback. Even setting aside the
more fundamental argument about whether such a replication solution
belongs in core (I am skeptical as well, btw), on the code alone I
don't think this patch is acceptable as-is without more jsonb
infrastructure for in-memory manipulation of containers (i.e.,
removing the ObjTree stuff).
Regards,
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add CREATE support to event triggers

2014-12-02 Thread Robert Haas
 Normally you would replicate between an older master and a newer
 replica, so this shouldn't be an issue.  I find it unlikely that we
 would de-support some syntax that works in an older version: it would
 break pg_dump, for one thing.

The most common way in which we break forward-compatibility is by
reserving additional keywords.  Logical replication can deal with this
by quoting all identifiers, so it's not a big issue.
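As a toy illustration of why always-quoting works, here is a minimal re-implementation of PostgreSQL-style identifier quoting in Python (illustrative only, not the server's actual quote_ident code):

```python
def quote_ident(name: str) -> str:
    """Always-quote an SQL identifier: wrap it in double quotes and
    double any embedded double quotes. A quoted identifier is never
    parsed as a keyword, so newly reserved words in a later major
    version cannot break the emitted DDL."""
    return '"' + name.replace('"', '""') + '"'

# "window" became a reserved word in 8.4; quoted, it is still a valid name.
print(quote_ident("window"))    # -> "window"
print(quote_ident('odd"name'))  # -> "odd""name"
```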

It's not the only possible issue; it is certainly conceivable that we
could change something in the server and then try to change pg_dump to
compensate.  I think we wouldn't do that with anything very big, but
suppose spgist were supplanted by a new and better access method
tsigps.  Well, we might figure that there are few enough people
accustomed to the current syntax that we can get away with hacking the
pg_dump output, and yes, anyone with an existing dump from an older
system will have problems, but there won't be many of them, and they
can always adjust the dump by hand.  If we ever decided to do such a
thing, whatever syntax transformation we did in pg_dump would have to
be mimicked by any logical replication solution.

I think it's legitimate to say that this could be a problem, but I
think it probably won't be a problem very often, and I think when it
is a problem it probably won't be a very big problem, because we
already have good reasons not to break the ability to restore older
dumps on newer servers more often than absolutely necessary.

One of the somewhat disappointing things about this is that we're
talking about adding a lot of new deparsing code to the server that
probably, in some sense, duplicates pg_dump.  Perhaps in an ideal
world there would be a way to avoid that, but in the world we really
live in, there probably isn't.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Add CREATE support to event triggers

2014-11-27 Thread Bruce Momjian
On Wed, Nov 26, 2014 at 09:01:13PM -0500, Stephen Frost wrote:
 * Alvaro Herrera (alvhe...@2ndquadrant.com) wrote:
  Bruce Momjian wrote:
   How would replicating DDL handle cases where the master and slave
   servers have different major versions and the DDL is only supported by
   the Postgres version on the master server?
  
  Normally you would replicate between an older master and a newer
  replica, so this shouldn't be an issue.  I find it unlikely that we
  would de-support some syntax that works in an older version: it would
  break pg_dump, for one thing.

I like the idea of older master/new replica, but what about
bidirectional replication?

Would the create/alter/drop WAL logical structure remain consistent
across major versions, or would the code have to read at least one older
version?  Would we limit it to one?

 While I tend to agree with you that it's not something we're likely to
 do, how would it break pg_dump?  We have plenty of version-specific
 logic in pg_dump and we could certainly generate a different syntax in
 a newer version than we did in an older version, if the newer server was
 expecting something different.  We've always held that you should use
 the version of pg_dump which matches the server you are moving *to*, after
 all.

pg_upgrade avoids having to deal with major version changes by
leveraging pg_dump.  Is it possible to have the new replica use the new
pg_dump to connect to the old master to get a CREATE command that it can
execute?  Would that avoid having to ship CREATE syntax?  What it
wouldn't help with is ALTER and DROP though.  (We do have ALTER but I
think only for inheritance cases.)

  In other words I view cross-version replication as a mechanism to
  upgrade, not something that you would use permanently.  Once you
  finish upgrading, promote the newer server and ditch the old master.
 
 I agree with this also.

   If it would fail, does this limit the idea that logical replication
   allows major version-different replication?
  
  Not in my view, at least.
 
 I'm all for having logical replication be a way to do major version
 different replication (particularly for the purposes of upgrades), but
 it shouldn't mean we can never de-support a given syntax.
 
 As one example, we've discussed a few times removing certain table-level
 privileges on the basis that they practically mean you own the table.
 Perhaps that'll never actually happen, but if it does, logical
 replication would need to deal with it.

Should we just tell the user not to modify the database schema during
this period?  Should we have a server mode which disables DDL?

-- 
  Bruce Momjian  br...@momjian.us  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] Add CREATE support to event triggers

2014-11-26 Thread Bruce Momjian
On Sat, Nov  8, 2014 at 05:56:00PM +0100, Andres Freund wrote:
 On 2014-11-08 11:52:43 -0500, Tom Lane wrote:
  Adding a similar
  level of burden to support a feature with a narrow use-case seems like
  a nonstarter from here.
 
 I don't understand this statement. In my experience the lack of a usable
 replication solution that allows temporary tables and major version
 differences is one of the most, if not *the* most, frequent criticisms
 of postgres I hear. How is this a narrow use case?

How would replicating DDL handle cases where the master and slave
servers have different major versions and the DDL is only supported by
the Postgres version on the master server?  If it would fail, does this
limit the idea that logical replication allows major version-different
replication?

-- 
  Bruce Momjian  br...@momjian.us  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] Add CREATE support to event triggers

2014-11-26 Thread Alvaro Herrera
Bruce Momjian wrote:
 On Sat, Nov  8, 2014 at 05:56:00PM +0100, Andres Freund wrote:
  On 2014-11-08 11:52:43 -0500, Tom Lane wrote:
   Adding a similar level of burden to support a feature with a
   narrow use-case seems like a nonstarter from here.
  
  I don't understand this statement. In my experience the lack of a
  usable replication solution that allows temporary tables and major
  version differences is one of the most, if not *the* most, frequent
  criticisms of postgres I hear. How is this a narrow use case?
 
 How would replicating DDL handle cases where the master and slave
 servers have different major versions and the DDL is only supported by
 the Postgres version on the master server?

Normally you would replicate between an older master and a newer
replica, so this shouldn't be an issue.  I find it unlikely that we
would de-support some syntax that works in an older version: it would
break pg_dump, for one thing.

In other words I view cross-version replication as a mechanism to
upgrade, not something that you would use permanently.  Once you
finish upgrading, promote the newer server and ditch the old master.

 If it would fail, does this limit the idea that logical replication
 allows major version-different replication?

Not in my view, at least.

-- 
Álvaro Herrera  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Add CREATE support to event triggers

2014-11-26 Thread Stephen Frost
* Alvaro Herrera (alvhe...@2ndquadrant.com) wrote:
 Bruce Momjian wrote:
  How would replicating DDL handle cases where the master and slave
  servers have different major versions and the DDL is only supported by
  the Postgres version on the master server?
 
 Normally you would replicate between an older master and a newer
 replica, so this shouldn't be an issue.  I find it unlikely that we
 would de-support some syntax that works in an older version: it would
 break pg_dump, for one thing.

While I tend to agree with you that it's not something we're likely to
do, how would it break pg_dump?  We have plenty of version-specific
logic in pg_dump and we could certainly generate a different syntax in
a newer version than we did in an older version, if the newer server was
expecting something different.  We've always held that you should use
 the version of pg_dump which matches the server you are moving *to*, after
all.

 In other words I view cross-version replication as a mechanism to
 upgrade, not something that you would use permanently.  Once you
 finish upgrading, promote the newer server and ditch the old master.

I agree with this also.

  If it would fail, does this limit the idea that logical replication
  allows major version-different replication?
 
 Not in my view, at least.

I'm all for having logical replication be a way to do major version
different replication (particularly for the purposes of upgrades), but
it shouldn't mean we can never de-support a given syntax.

As one example, we've discussed a few times removing certain table-level
privileges on the basis that they practically mean you own the table.
Perhaps that'll never actually happen, but if it does, logical
replication would need to deal with it.

Thanks,

Stephen




Re: [HACKERS] Add CREATE support to event triggers

2014-11-15 Thread Robert Haas
On Fri, Nov 14, 2014 at 1:18 PM, Andres Freund and...@2ndquadrant.com wrote:
 I think it's a good idea to structure independent features in a way that
 other solutions can reuse them. But I sure as hell can't force them to
 use it - especially as there's unfortunately not too much development
 going on in the existing logical replication solutions for postgres.

 Btw, I really think there's quite legitimate use cases for this besides
 logical replication solutions - e.g. schema tracking is quite a sensible
 use case.

Well, as I already said, despite my doubts about the general utility
of this feature, I'm willing to see us take it IF we have a testing
framework that will reliably catch bugs, including bugs of omission.
Without that, I'm very confident it's going to be a maintenance
nightmare, and I believe you admitted yourself that that concern was
reasonable.

 Can you please hint at some other workable design? I don't think there
 really is anything else.

I think this really depends on what you mean by "anything else".  Any
DDL replication solution is necessarily going to involve the following
steps:

1. Define some set of primitives such that any imaginable DDL
operation can be expressed as a series of those primitives.
2. Find a way to capture those events as they happen.
3. Serialize them into some wire format and transport that format to
the replica.
4. Apply them, possibly coordinating in some way with the master so
that the user's original request fails if the apply fails.

There are meaningful choices at every step.  You're proposing that the
primitives should be anything that can be expressed as a complete SQL
command against a single object (I think - what are you going to emit
for an ALL IN SCHEMA op - that thing itself, or a similar operation
against each object in the schema?); that the capture mechanism should
be an event trigger that inserts into a queue table; that the
serialization format should be a JSON language designed to allow
reassembly of the corresponding SQL statement; and that the apply
coordination mechanism should be 2PC.  But none of that is written in
stone.
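To make the "JSON language designed to allow reassembly" idea concrete, here is a hedged sketch of how a deparse template might be re-expanded into SQL text. The %{...}I / %{...}s mini-format and the helper names here are hypothetical, loosely modeled on the format strings discussed for the patch:

```python
import re

def expand(fmt: str, args: dict) -> str:
    """Expand a deparse template back into SQL text:
    %{name}I splices args[name] as a quoted identifier,
    %{name}s splices args[name] verbatim."""
    def repl(m):
        val = args[m.group(1)]
        if m.group(2) == "I":
            # identifiers are always quoted, keyword-proofing the output
            return '"' + val.replace('"', '""') + '"'
        return val
    return re.sub(r"%\{(\w+)\}([Is])", repl, fmt)

tree = {"fmt": "CREATE TABLE %{name}I (%{definition}s)",
        "name": "foo", "definition": "a int DEFAULT 1, b int"}
print(expand(tree["fmt"], tree))
# -> CREATE TABLE "foo" (a int DEFAULT 1, b int)
```

The point of shipping a structured tree rather than the raw statement is that the receiver (or an event trigger user) can edit fields before reassembly.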

As far as deciding what primitives to use, I think the principal
decision to be made here is as to the appropriate level of
granularity.  For example, CREATE TABLE foo (a int DEFAULT 1, b int,
CHECK (b < 42)) could be emitted as a single event saying that a table
was created.  But it could also be emitted as create-table (foo),
add-column (foo, a, int), add-column (foo, b, int), add-column-default
(a, 1), add-check-constraint (foo, b < 42).  The appeal of a more
fine-grained set of primitives is that there might be fewer of them,
and that each of them might be simpler; one of the things that makes
physical replication so reliable is that its primitives are very
simple and thus easy to verify.
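The two granularities can be written down as plain data to see the trade-off; the primitive names below are taken from the example above and are purely illustrative:

```python
# Coarse: one event carrying the whole table definition.
coarse = [("create-table",
           {"name": "foo",
            "columns": [("a", "int", {"default": "1"}),
                        ("b", "int", {})],
            "checks": ["b < 42"]})]

# Fine-grained: the same DDL as a series of small, easily verified steps.
fine = [("create-table", "foo"),
        ("add-column", "foo", "a", "int"),
        ("add-column-default", "foo", "a", "1"),
        ("add-column", "foo", "b", "int"),
        ("add-check-constraint", "foo", "b < 42")]

# Each fine-grained primitive changes one property, so an apply side can
# validate, transform, or reject steps individually.
print(len(coarse), len(fine))  # -> 1 5
```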

The possible event-capture mechanisms seem to be to have either (a)
event trigger or (b) a hook function in some yet-to-be-defined place
or (c) core code which will either (i) write each event to a table,
(ii) write each event directly into WAL, or perhaps (iii) write it
someplace else (file? in-memory queue?  network socket?).

There are lots of possible serialization formats.

Coordinating with the master could involve 2PC, as you propose; or
trying to somehow validate that a given series of events is a valid
state transformation based on the starting state on the standby before
doing the operation on the master; or the use of a distributed
transaction coordinated by something like PG-XC's global transaction
manager; or you can skip it and hope for the best.

In addition to the decisions above, you can try to prevent failures by
restricting certain changes from happening, or you can let users
change what they like and hope for the best.  Different solutions can
have different mechanisms for controlling which objects are under
replication and which changes are not; or even allowing some
individual DDL statements to opt out of replication while forcing
others to participate.  Administratively, solutions can be built to
facilitate easy replication of an entire database to another node, or
more specific applications like sharding, where creating a table on a
master node creates child tables on a bunch of slave nodes, but
they're not all identical, because we're partitioning the data so that
only some of it will be on each node - thus the constraints and so on
will be different.

BDR has one set of answers to all of these questions, and despite my
worries about a few points here and there, they are not stupid
answers.  But they are not the ONLY answers either.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Add CREATE support to event triggers

2014-11-14 Thread Robert Haas
On Thu, Nov 13, 2014 at 7:45 AM, Andres Freund and...@2ndquadrant.com wrote:
 Right.  And that's why it's cool that logical decoding can operate
 through DDL differences.  The apply side might not be able to cope
 with what pops out, but that's not logical decoding's fault, and
 different apply-sides can adopt different policies as to how to deal
 with whatever problems crop up.

 I think pretty much all of the solutions just say "oops, you're on your
 own." And I can't blame them for that. Once there's a schema difference
 and it causes problem there's really not much that can be done.

Agreed.

 I don't know exactly what you mean by "a normal single master setup."
 Surely the point of logical decoding is that the replica might not be
 identical to the master.

 I actually think that that's not primarily the point if you talk about
 individual objects. The majority of objects will be exactly the same on
 all nodes. If you actually want to have differing objects on the nodes
 you'll have to opt out/in (depending on your solution) of ddl
 replication for those objects.

I'm not saying it's the primary point; I'm just saying that it can happen.

 And if it isn't, then a command that
 succeeded on the master might fail on the standby - for example,
 because an object by that name already exists there, or because a type
 doesn't exist there.  (Even if you replicate a CREATE EXTENSION
 command, there's no guarantee that the .so exists on the target.) Then
 what?

 Sure. There's reasons logical replication isn't always a win. But I
 don't see why that's a reason not to make it as robust as possible.

I agree.

 Btw, the .so problem exists for WAL shipping as well.

Not in the same way.  You might not be able to access the data on the
standby if the .so defining the datatype isn't present, but the
replication itself doesn't care.

 This is basically the same problem as multi-master replication
 conflicts, except with DDL.  Resolving replication conflicts is not a
 very easy thing to get right even if you're only concerned about the
 rows in the tables.  It's probably harder if you're worried about the
 schema, too.

 I don't think it's a sane thing to do multimaster with differing schemas
 for individual relations, except maybe additional nonunique indexes.

But that's exactly what you ARE doing, isn't it?  I mean, if you
replicate in only one direction, nothing keeps someone from modifying
things on the replica independently of BDR, and if you replicate in
both directions, well that's multi-master.

  The solution here doesn't force you to do that, does it? It's something
  that can be used by more than one replication solution?

 In theory, yes.

 What's the practical point here?

I am quite doubtful about whether there will ever be a second, working
implementation, so I see all of this code - and the maintenance effort
associated with it - as something that will really only benefit BDR.
I understand that you don't see it that way, and I'm not saying that
you are offering anything in bad faith, but it looks to me like even
with all of this very substantial amount of infrastructure, you're
still going to need a big pile of additional code inside BDR to
actually make it work, and I don't hear anyone else offering to
develop something similar.

 I think generally logical replication has more failure cases than
 physical ones. Which you seem to agree with.

Yes, I agree with that.

 Physical replication is basically no-fail because it just
 says, hey, go write these bytes into this page, and we can pretty much
 always do that. But statement-based logical replication means
 basically executing arbitrary chunks of code all over the backend, and
 there is just no way to guarantee that code won't throw an error.  So
 the best we can do is to hope that those errors will get reported back
 to the user, which is going to require some kind of distributed
 transaction.  Your idea to just run the replicated DDL statements on
 the standby before committing on the master is one approach to that
 problem, and probably the simplest one, but not the only one - one can
 imagine something that resembles true clustering, for example.

 I think that's generally not what people need for primary/standby
 cases (of subsets of tables). In practice there aren't many failures
 like that besides schema differences. And there's just many use cases
 that aren't doable with physical replication, so we can't advise people
 doing that.

I can't really follow this, especially the first sentence.

 By the way, the fact that you're planning to do log-based replication
 of DML and trigger-based replication of DDL scares the crap out of me.
 I'm not sure how that's going to work at all if the two are
 interleaved in the same transaction.

 Maybe that's based on a misunderstanding. All the event trigger does is
 insert a event into a (local) queue. That's then shipped to the other
 side using the same replication mechanisms as used for rows. Then, when
 

Re: [HACKERS] Add CREATE support to event triggers

2014-11-14 Thread Andres Freund
On 2014-11-14 12:38:52 -0500, Robert Haas wrote:
  This is basically the same problem as multi-master replication
  conflicts, except with DDL.  Resolving replication conflicts is not a
  very easy thing to get right even if you're only concerned about the
  rows in the tables.  It's probably harder if you're worried about the
  schema, too.
 
  I don't think it's a sane thing to do multimaster with differing schemas
  for individual relations, except maybe additional nonunique indexes.
 
 But that's exactly what you ARE doing, isn't it?  I mean, if you
 replicate in only one direction, nothing keeps someone from modifying
 things on the replica independently of BDR, and if you replicate in
 both directions, well that's multi-master.

Meh. By that definition any logical replication solution does
multi-master replication. Either you tell your users that that's not allowed,
or you just prevent it by technical means. Absolutely the same is true
for table contents.

FWIW, in BDR we *do* prevent schemas from being modified independently
on different nodes (unless you set the 'running with scissors' guc).

 I am quite doubtful about whether there will ever be a second, working
 implementation, so I see all of this code - and the maintenance effort
 associated with it - as something that will really only benefit BDR.
 I understand that you don't see it that way, and I'm not saying that
 you are offering anything in bad faith, but it looks to me like even
 with all of this very substantial amount of infrastructure, you're
 still going to need a big pile of additional code inside BDR to
 actually make it work, and I don't hear anyone else offering to
 develop something similar.

I don't know what to say about this. I don't see any other realistic way
to perform DDL replication in logical rep, and nearly all my
conversations with users have indicated that they want that.

I think it's a good idea to structure independent features in a way that
other solutions can reuse them. But I sure as hell can't force them to
use it - especially as there's unfortunately not too much development
going on in the existing logical replication solutions for postgres.

Btw, I really think there's quite legitimate use cases for this besides
logical replication solutions - e.g. schema tracking is quite a sensible
use case.

  By the way, the fact that you're planning to do log-based replication
  of DML and trigger-based replication of DDL scares the crap out of me.
  I'm not sure how that's going to work at all if the two are
  interleaved in the same transaction.
 
  Maybe that's based on a misunderstanding. All the event trigger does is
  insert a event into a (local) queue. That's then shipped to the other
  side using the same replication mechanisms as used for rows. Then, when
  receiving changes in that ddl queue the standby performs those actions
  before continuing with the replay.
  That makes the interleaving on the standby pretty much the same as
  on the primary.
 
 OK.  But that's also not something that I can really imagine us ever
 adopting in core.

Well, that bit really depends on what the user (e.g. a replication
solution, or a schema tracking feature) needs. There's certain things
that you can quite easily do as part of core (e.g. insert something into
the WAL stream), that you just can't as external code.

I don't think there's any external replication solution that won't have
some form of internal queue to manipulate its internal state. For an
external solution such a queue currently pretty much has to be some
table, but for an eventual in core solution it could be done
differently.
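The reason such a queue works is ordering: row changes and DDL travel in one stream and are replayed in the order the primary produced them. A minimal sketch of that replay loop (function names and payloads are illustrative):

```python
def replay(stream, apply_dml, apply_ddl):
    """Replay a single ordered stream of changes. Because DDL events sit
    in the same queue as row changes, the standby sees them in exactly
    the interleaving the primary produced."""
    for kind, payload in stream:
        if kind == "ddl":
            apply_ddl(payload)   # e.g. ALTER TABLE t ADD COLUMN c int
        else:
            apply_dml(payload)   # a row change that may rely on the new column

log = []
stream = [("dml", "row-1"),
          ("ddl", "add column c"),
          ("dml", "row-2-with-c")]
replay(stream, log.append, lambda d: log.append("DDL: " + d))
print(log)  # -> ['row-1', 'DDL: add column c', 'row-2-with-c']
```

If DDL instead traveled out of band, "row-2-with-c" could arrive before the column exists and the apply would fail.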

 If we were going to do DDL replication in core, I have to believe we'd
 find a way to put all of the information in the WAL stream, not use
 triggers.

I agree that we might want to represent the transport to standbys
differently for something in core. That there's many different ways the
deparsing output could be used imo is a good reason why that part of the
mechanism isn't part of this submission.

I don't really understand the arguments against triggers in general
though. We're already using them quite extensively - I don't see why DDL
replication has to meet some completely different bar than say foreign
key checks or deferred uniqueness checks. We easily can add 'implicit'
event triggers, that aren't defined inside the catalog if we feel like
it. I'm just not sure we really would need/want to.

  Yes, it's not trivial. And I think there's some commands where you might
  not want to try but either scream ERROR or just rereplicate the whole
  relation or such.
 
  I very strongly feel that we (as postgres devs) have a *much* higher
  chance of recognizing these cases than either some random users (that
  write slonik scripts or similar) or some replication solution authors
  that aren't closely involved with -hackers.
 
 It's clearly the job of the replication solution authors to figure
 these details out.  I'm not going to 

Re: [HACKERS] Add CREATE support to event triggers

2014-11-13 Thread Robert Haas
On Wed, Nov 12, 2014 at 4:58 PM, Andres Freund and...@2ndquadrant.com wrote:
 That's already the situation today with all the logical replication
 solutions. They *constantly* break in the field. Most commonly because
 of DDL differences.

Right.  And that's why it's cool that logical decoding can operate
through DDL differences.  The apply side might not be able to cope
with what pops out, but that's not logical decoding's fault, and
different apply-sides can adopt different policies as to how to deal
with whatever problems crop up.

 I don't understand why you think it's likely for logical replication to
 break due to this? You mean because deparse yielded an invalid statement?
 In a normal single master setup there really shouldn't be scenarios
 where that happens? Except bugs - but as you know we had more than in
 HS/SR as well?

I don't know exactly what you mean by "a normal single master setup."
Surely the point of logical decoding is that the replica might not be
identical to the master.  And if it isn't, then a command that
succeeded on the master might fail on the standby - for example,
because an object by that name already exists there, or because a type
doesn't exist there.  (Even if you replicate a CREATE EXTENSION
command, there's no guarantee that the .so exists on the target.) Then
what?

This is basically the same problem as multi-master replication
conflicts, except with DDL.  Resolving replication conflicts is not a
very easy thing to get right even if you're only concerned about the
rows in the tables.  It's probably harder if you're worried about the
schema, too.

 We are thinking about extending 2PC to be usable across logical
 decoding. That's a relatively simple patch. Then it's possible to do the
 DDL on the primary, ship it to the standby, apply it there, and only
 afterwards commit the prepared xact if that was successful.  That's
 quite cool - but somewhat in the remit of the replication solution.

That would certainly make the user aware of a quite a few kinds of
errors that might otherwise go undetected.
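A hedged sketch of that flow, with stub connections standing in for real sessions (the statement strings mirror PostgreSQL's two-phase commands, but the orchestration and names here are purely illustrative):

```python
class FakeConn:
    """Stand-in for a database session; records executed statements."""
    def __init__(self, fail_on=None):
        self.log = []
        self.fail_on = fail_on
    def execute(self, sql):
        if self.fail_on and self.fail_on in sql:
            raise RuntimeError("apply failed: " + sql)
        self.log.append(sql)

def replicate_ddl(primary, standby, ddl, gid="ddl-1"):
    """Run the DDL on the primary inside a prepared transaction, apply it
    on the standby, and commit the prepared transaction only on success,
    so the user's original request fails if the standby cannot apply it."""
    primary.execute(ddl)
    primary.execute("PREPARE TRANSACTION '%s'" % gid)
    try:
        standby.execute(ddl)
    except RuntimeError:
        primary.execute("ROLLBACK PREPARED '%s'" % gid)
        return False
    primary.execute("COMMIT PREPARED '%s'" % gid)
    return True

p, s = FakeConn(), FakeConn(fail_on="ALTER")
print(replicate_ddl(p, s, "CREATE TABLE t (a int)"))   # -> True
print(replicate_ddl(p, s, "ALTER TABLE t ADD b int"))  # -> False
```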

 The solution here doesn't force you to do that, does it? It's something
 that can be used by more than one replication solution?

In theory, yes.

 I just don't see the alternative you're proposing? I've so far not even
 seen a credible *sketch* of an alternative design that also can handle
 ALTER.  The only current alternatives are 1) the user inserts some
 events into the queue manually. If they depend on any local state you're
 screwed. If they have syntax errors they're often screwed. 2). The user
 does all actions on the standby first. Then on the primary. That's hard
 for ALTER ADD COLUMN and similar, and just about impossible for renaming
 things.

It's a really hard problem.

I don't think it's possible to make statement-based replication
no-fail.  Physical replication is basically no-fail because it just
says, hey, go write these bytes into this page, and we can pretty much
always do that. But statement-based logical replication means
basically executing arbitrary chunks of code all over the backend, and
there is just no way to guarantee that code won't throw an error.  So
the best we can do is to hope that those errors will get reported back
to the user, which is going to require some kind of distributed
transaction.  Your idea to just run the replicated DDL statements on
the standby before committing on the master is one approach to that
problem, and probably the simplest one, but not the only one - one can
imagine something that resembles true clustering, for example.

By the way, the fact that you're planning to do log-based replication
of DML and trigger-based replication of DDL scares the crap out of me.
I'm not sure how that's going to work at all if the two are
interleaved in the same transaction.  Also, relying on triggers for
replication is generally not awesome, because it increases
administrative complexity.  Event triggers are probably better in that
regard than ordinary triggers, because they're database-wide, but I
don't think they solve the problem completely.  But the thing that
scares me even more is that the DDL replication is not only
trigger-based, but statement-based.  Why don't we do logical
replication by recording all of the SQL statements that run on the
master and re-executing them on the standby?  Well, because we all
know that there will be plenty of important cases where that doesn't
yield the same results on both servers.  There's no intrinsic reason
why that shouldn't also be a problem for DDL replication, and indeed
it is.  This patch set is trying to patch around that by finding a way
to emit a revised DDL statement that is guaranteed to do exactly the
same thing on both machines, and it's probably possible to do that in
most cases, but it's probably not possible to do that in all cases.

To be clear, none of this is a reason to reject this patch set; I've
explained my reasons for being unhappy with it elsewhere, and I think
they are 

Re: [HACKERS] Add CREATE support to event triggers

2014-11-13 Thread Andres Freund
On 2014-11-13 07:17:32 -0500, Robert Haas wrote:
 On Wed, Nov 12, 2014 at 4:58 PM, Andres Freund and...@2ndquadrant.com wrote:
  That's already the situation today with all the logical replication
  solutions. They *constantly* break in the field. Most commonly because
  of DDL differences.
 
 Right.  And that's why it's cool that logical decoding can operate
 through DDL differences.  The apply side might not be able to cope
 with what pops out, but that's not logical decoding's fault, and
 different apply-sides can adopt different policies as to how to deal
 with whatever problems crop up.

I think pretty much all of the solutions just say oops, you're on your
own. And I can't blame them for that. Once there's a schema difference
and it causes a problem, there's really not much that can be done.

  I don't understand why you think it's likely for logical replication to
  break due to this? You mean because deparse yielded an invalid statement?
  In a normal single master setup there really shouldn't be scenarios
  where that happens? Except bugs - but as you know we had more than a few
  in HS/SR as well?
 
 I don't know exactly what you mean by a normal single master setup.
 Surely the point of logical decoding is that the replica might not be
 identical to the master.

I actually think that that's not primarily the point if you talk about
individual objects. The majority of objects will be exactly the same on
all nodes. If you actually want to have differing objects on the nodes
you'll have to opt out/in (depending on your solution) of DDL
replication for those objects.

 And if it isn't, then a command that
 succeeded on the master might fail on the standby - for example,
 because an object by that name already exists there, or because a type
 doesn't exist there.  (Even if you replicate a CREATE EXTENSION
 command, there's no guarantee that the .so exists on the target.) Then
 what?

Sure. There are reasons logical replication isn't always a win. But I
don't see why that's a reason not to make it as robust as possible.

Btw, the .so problem exists for WAL shipping as well.

 This is basically the same problem as multi-master replication
 conflicts, except with DDL.  Resolving replication conflicts is not a
 very easy thing to get right even if you're only concerned about the
 rows in the tables.  It's probably harder if you're worried about the
 schema, too.

I don't think it's a sane thing to do multimaster with differing schemas
for individual relations, except maybe additional nonunique indexes.

  We are thinking about extending 2PC to be usable across logical
  decoding. That's a relatively simple patch. Then it's possible to do the
  DDL on the primary, ship it to the standby, apply it there, and only
  afterwards commit the prepared xact if that was successful.  That's
  quite cool - but somewhat in the remit of the replication solution.
 
 That would certainly make the user aware of a quite a few kinds of
 errors that might otherwise go undetected.

It is (or rather would be) a generally quite cool feature imo ;)

  The solution here doesn't force you to do that, does it? It's something
  that can be used by more than one replication solution?
 
 In theory, yes.

What's the practical point here?

  I just don't see the alternative you're proposing? I've so far not even
  seen a credible *sketch* of an alternative design that also can handle
  ALTER.  The only current alternatives are 1) the user inserts some
  events into the queue manually. If they depend on any local state you're
  screwed. If they have syntax errors they're often screwed. 2). The user
  does all actions on the standby first. Then on the primary. That's hard
  for ALTER ADD COLUMN and similar, and just about impossible for renaming
  things.
 
 It's a really hard problem.
 
 I don't think it's possible to make statement-based replication
 no-fail.

I think logical replication generally has more failure cases than
physical replication - which you seem to agree with.

 Physical replication is basically no-fail because it just
 says, hey, go write these bytes into this page, and we can pretty much
 always do that. But statement-based logical replication means
 basically executing arbitrary chunks of code all over the backend, and
 there is just no way to guarantee that code won't throw an error.  So
 the best we can do is to hope that those errors will get reported back
 to the user, which is going to require some kind of distributed
 transaction.  Your idea to just run the replicated DDL statements on
 the standby before committing on the master is one approach to that
 problem, and probably the simplest one, but not the only one - one can
 imagine something that resembles true clustering, for example.

I think that's generally not what people need for primary/standby
cases (of subsets of tables). In practice there aren't many failures
like that besides schema differences. And there are just many use cases
that aren't doable with physical 

Re: [HACKERS] Add CREATE support to event triggers

2014-11-12 Thread Robert Haas
On Mon, Nov 10, 2014 at 9:02 PM, Jim Nasby jim.na...@bluetreble.com wrote:
 +1. Adding columns is a PITA, you have to manually ensure you do it on all
 slaves first.

 Drop is somewhat worse, because you have to do it on the master first,
 opposite of the (more usual) case of adding a column.

 RENAME is a complete disaster.

 Handing scripts to your replication system to execute isn't a very good
 alternative either; it assumes that you actually have a script (bad
 assumption with ORMs), and that you have a reasonable way to get that script
 to wherever you run your replication system.

I don't disagree with any of that, but running the command on the
master and then propagating it to the slaves where it may succeed or
fail - and if it fails, you won't know unless you're watching the logs
on those machines, and, oh by the way, replication will also be broken
- is not good either.  We would never have shipped a physical
replication solution with that kind of limitation.  What has made
streaming replication so popular and successful with PostgreSQL users
over the last five years is that, while it's a bit of a pain to get
set up, once you have it set up, it is rock-solid.  If there were a
series of legal SQL commands that you could execute without error on a
cluster of servers connected by streaming replication such that, when
you got done, replication was broken, our users would scream bloody
murder, or just stop using PostgreSQL.  I think the approach to DDL
replication that Alvaro, Andres, et al. are proposing here is
absolutely fine - even praiseworthy - as an out-of-core solution that
users can adopt if they are willing to accept the associated risks, as
many users probably will be.  But you wouldn't convince me to run it
on any production system for which I was responsible.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Add CREATE support to event triggers

2014-11-12 Thread Andres Freund
On 2014-11-12 16:36:30 -0500, Robert Haas wrote:
 On Mon, Nov 10, 2014 at 9:02 PM, Jim Nasby jim.na...@bluetreble.com wrote:
  +1. Adding columns is a PITA, you have to manually ensure you do it on all
  slaves first.
 
  Drop is somewhat worse, because you have to do it on the master first,
  opposite of the (more usual) case of adding a column.
 
  RENAME is a complete disaster.
 
  Handing scripts to your replication system to execute isn't a very good
  alternative either; it assumes that you actually have a script (bad
  assumption with ORMs), and that you have a reasonable way to get that script
  to wherever you run your replication system.
 
 I don't disagree with any of that, but running the command on the
 master and then propagating it to the slaves where it may succeed or
 fail - and if it fails, you won't know unless you're watching the logs
 on those machines, and, oh by the way, replication will also be broken
 - is not good either.

That's already the situation today with all the logical replication
solutions. They *constantly* break in the field. Most commonly because
of DDL differences.

I don't understand why you think it's likely for logical replication to
break due to this? You mean because deparse yielded an invalid statement?
In a normal single master setup there really shouldn't be scenarios
where that happens? Except bugs - but as you know we had more than a few
in HS/SR as well?

Or are you worried about stuff like ALTER TABLE ... USING()? I think
that's the replication solution's job to take care of/prevent.

For multimaster the situation is more complex, I agree, but I don't
think in core stuff needs to solve that for now?

We are thinking about extending 2PC to be usable across logical
decoding. That's a relatively simple patch. Then it's possible to do the
DDL on the primary, ship it to the standby, apply it there, and only
afterwards commit the prepared xact if that was successful.  That's
quite cool - but somewhat in the remit of the replication solution.
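The flow described above can be sketched with ordinary two-phase commit
commands (a sketch only: the decoding and standby-apply machinery is
assumed, and the transaction identifier is arbitrary; prepared
transactions require max_prepared_transactions > 0):

```sql
-- On the primary: run the DDL inside a transaction and prepare it
-- instead of committing outright.
BEGIN;
ALTER TABLE t ADD COLUMN c integer;
PREPARE TRANSACTION 'ddl_example_1';

-- The prepared transaction is decoded and shipped; the standby applies
-- the DDL. Only if that apply succeeds does the primary finish:
COMMIT PREPARED 'ddl_example_1';

-- If the standby-side apply fails, the primary aborts instead:
-- ROLLBACK PREPARED 'ddl_example_1';
```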

 I think the approach to DDL
 replication that Alvaro, Andres, et al. are proposing here is
 absolutely fine - even praiseworthy - as an out-of-core solution that
 users can adopt if they are willing to accept the associated risks, as
 many users probably will be.  But you wouldn't convince me to run it
 on any production system for which I was responsible.

The solution here doesn't force you to do that, does it? It's something
that can be used by more than one replication solution?

I just don't see the alternative you're proposing? I've so far not even
seen a credible *sketch* of an alternative design that also can handle
ALTER.  The only current alternatives are 1) the user inserts some
events into the queue manually. If they depend on any local state you're
screwed. If they have syntax errors they're often screwed. 2). The user
does all actions on the standby first. Then on the primary. That's hard
for ALTER ADD COLUMN and similar, and just about impossible for renaming
things.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] Add CREATE support to event triggers

2014-11-10 Thread Christopher Browne
On 8 November 2014 17:49, Robert Haas robertmh...@gmail.com wrote:
  We could just integrate those parts, and be done with it. But would that
  actually be a good thing for the community? Then slony needs to do it
  and potentially others as well? Then auditing can't use it? Then
  potential schema tracking solutions can't use it?

 Do you think Slony is really going to use this?  I guess we can let
 the Slony guys speak for themselves, but I've been skeptical since day
 one that this is the best way to do DDL replication, and I still am.
 There are lots of ways that a replicated DDL statement can fail on the
 replicas, and what are you going to do then?  It's too late to show
 the user the error message, so you can throw it in a log someplace and
 hope that somebody notices, but that's it.  It makes a lot more sense
 to me to use some kind of a tool that applies the DDL in a coordinated
 fashion on all nodes - or even just do it manually, since it might
 very well be desirable to take the lock on different nodes at widely
 different times, separated by a switchover.  I certainly think there's
 a use-case for what you're trying to do here, but I don't think it'll
 be right for everyone.

 Certainly, if the Slony guys - or some other team building an
 out-of-core replication solution says, hey, we really want this in
 core, that would considerably strengthen the argument for putting it
 there.  But I haven't heard anyone say that yet - unlike logical
 decoding, where we did have other people expressing clear interest in
 using it.

  There've been people for a long while asking about triggers on catalogs
  for that purpose. IIRC Jan was one of them.

 My impression, based on something Christopher Browne said a few years
 ago, is that Slony's DDL trigger needs are largely satisfied by the
 existing event trigger stuff.  It would be helpful to get confirmation
 as to whether that's the case.

I'm not sure that a replication system that intends to do partial
replication (e.g. being selective about which objects are to be
replicated) will necessarily want to use the CREATE event triggers to
capture creates.

Several cases pop up with different answers:
a) I certainly don't want to replicate temporary tables
b) I almost certainly don't want to replicate unlogged tables
c) For more ordinary tables, I'm not sure I want to extend Slony
to detect them and add them automatically, because there
are annoying sub-cases

   c.1) If I'm working on data conversion, I may create not totally
 temporary tables that are nonetheless not worthy to replicate.
 (I'm working on such right now)

Long and short: it seems likely that I'd frequently NOT want all new tables
added to replication, at least not all of them, all the time.

What would seem valuable, to me, would be to have a CREATE event
trigger that lets me know the OID and/or fully qualified name of the new
object so that perhaps the replication system:

a) Has some kind of rule system to detect if it wants to replicate it,

b) Logs the change so a human might know later that there's new stuff
that probably ought to be replicated

c) Perhaps a human might put replication into a new suggestive
mode, a bit akin to Slony's EXECUTE SCRIPT, but where the human
essentially says, Here, I'm running DDL against this connection for a
while, and I'd be grateful if Postgres told Slony to capture all the new
tables and sequences and replicate them.

There are kind of two approaches:

a) Just capture the OIDs, and have replication go back later and grab
the table definition once the dust clears on the master

b) We need to capture ALL the DDL, whether CREATE or ALTER, and
forward it, altered to have fully qualified names on everything so that
we don't need to duplicate all the set search_path requests and
such.

I suppose there's also a third...

c) Have a capability to put an event trigger function in place that makes
DDL requests fail.

That's more useful than you'd think; if, by default, we make them fail,
with an error message such as
  DDL request failed as it was not submitted using slonik DDL TOOL

then we have protection against uncontrolled application of DDL.

DDL TOOL would switch off the fail trigger, possibly trying to
capture the DDL, or perhaps just capturing the statements passed
to it so they get passed everywhere.   (That heads back to a) and b);
what should get captured...)

I'm not sure that all of that is totally internally coherent, but I hope
there are some ideas worth thinking about.
--
When confronted by a difficult problem, solve it by reducing it to the
question, How would the Lone Ranger handle this?


Re: [HACKERS] Add CREATE support to event triggers

2014-11-10 Thread Petr Jelinek

On 10/11/14 23:37, Christopher Browne wrote:

On 8 November 2014 17:49, Robert Haas robertmh...@gmail.com
 
 My impression, based on something Christopher Browne said a few years
  ago, is that Slony's DDL trigger needs are largely satisfied by the
  existing event trigger stuff.  It would be helpful to get confirmation
  as to whether that's the case.

I'm not sure that a replication system that intends to do partial
replication (e.g. being selective about which objects are to be
replicated) will necessarily want to use the CREATE event triggers to
capture creates.

Several cases pop up with different answers:
a) I certainly don't want to replicate temporary tables
b) I almost certainly don't want to replicate unlogged tables
c) For more ordinary tables, I'm not sure I want to extend Slony
 to detect them and add them automatically, because there
 are annoying sub-cases

c.1) If I'm working on data conversion, I may create not totally
  temporary tables that are nonetheless not worthy to replicate.
  (I'm working on such right now)

Long and short: it seems likely that I'd frequently NOT want all new tables
added to replication, at least not all of them, all the time.


I don't see how this is a problem with using CREATE event triggers; just 
put logic in your trigger that handles this. You get the object identity 
of the object that is being created/altered, so you can fetch any info 
about it you wish and easily filter however you want.
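For illustration, a filter along those lines might look like the
following (a sketch using the pg_event_trigger_ddl_commands() interface
that eventually shipped in 9.5, i.e. after this thread;
replicate_command() is a hypothetical queueing helper):

```sql
CREATE OR REPLACE FUNCTION filter_creates() RETURNS event_trigger
LANGUAGE plpgsql AS $$
DECLARE
    r record;
BEGIN
    FOR r IN SELECT * FROM pg_event_trigger_ddl_commands() LOOP
        -- Skip temporary and unlogged relations entirely.
        IF r.object_type IN ('table', 'sequence') THEN
            CONTINUE WHEN EXISTS (
                SELECT 1 FROM pg_class
                WHERE oid = r.objid AND relpersistence <> 'p');
        END IF;
        -- Queue everything else by object identity (hypothetical helper).
        PERFORM replicate_command(r.object_identity, r.command_tag);
    END LOOP;
END;
$$;

CREATE EVENT TRIGGER replicate_creates ON ddl_command_end
    EXECUTE PROCEDURE filter_creates();
```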



There are kind of two approaches:

a) Just capture the OIDs, and have replication go back later and grab
the table definition once the dust clears on the master

b) We need to capture ALL the DDL, whether CREATE or ALTER, and
forward it, altered to have fully qualified names on everything so that
we don't need to duplicate all the set search_path requests and
such.



This is basically what this patch gives you (actually both the canonicalized 
command and the identity)?



I suppose there's also a third...

c) Have a capability to put an event trigger function in place that makes
DDL requests fail.

That's more useful than you'd think; if, by default, we make them fail,
with an error message such as
   DDL request failed as it was not submitted using slonik DDL TOOL



You can do that already, it's even the example in the event trigger 
documentation.
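The documentation's example, slightly extended, looks roughly like this
(a sketch; the myrepl.allow_ddl setting and the exact message are
invented for illustration - a slonik-style tool would set the flag
before running its script):

```sql
CREATE OR REPLACE FUNCTION abort_any_command() RETURNS event_trigger
LANGUAGE plpgsql AS $$
BEGIN
    -- Refuse DDL unless the (invented) session flag is set by the tool.
    IF current_setting('myrepl.allow_ddl', true) IS DISTINCT FROM 'on' THEN
        RAISE EXCEPTION 'DDL command % blocked; submit it through the DDL tool',
            tg_tag;
    END IF;
END;
$$;

CREATE EVENT TRIGGER abort_ddl ON ddl_command_start
    EXECUTE PROCEDURE abort_any_command();
```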


--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] Add CREATE support to event triggers

2014-11-10 Thread Andres Freund
On 2014-11-10 17:37:40 -0500, Christopher Browne wrote:
 On 8 November 2014 17:49, Robert Haas robertmh...@gmail.com wrote:
   We could just integrate those parts, and be done with it. But would that
   actually be a good thing for the community? Then slony needs to do it
   and potentially others as well? Then auditing can't use it? Then
   potential schema tracking solutions can't use it?
 
  Do you think Slony is really going to use this?  I guess we can let
  the Slony guys speak for themselves, but I've been skeptical since day
  one that this is the best way to do DDL replication, and I still am.
  There are lots of ways that a replicated DDL statement can fail on the
  replicas, and what are you going to do then?  It's too late to show
  the user the error message, so you can throw it in a log someplace and
  hope that somebody notices, but that's it.  It makes a lot more sense
  to me to use some kind of a tool that applies the DDL in a coordinated
  fashion on all nodes - or even just do it manually, since it might
  very well be desirable to take the lock on different nodes at widely
  different times, separated by a switchover.  I certainly think there's
  a use-case for what you're trying to do here, but I don't think it'll
  be right for everyone.
 
  Certainly, if the Slony guys - or some other team building an
  out-of-core replication solution says, hey, we really want this in
  core, that would considerably strengthen the argument for putting it
  there.  But I haven't heard anyone say that yet - unlike logical
  decoding, where we did have other people expressing clear interest in
  using it.
 
   There've been people for a long while asking about triggers on catalogs
   for that purpose. IIRC Jan was one of them.
 
  My impression, based on something Christopher Browne said a few years
  ago, is that Slony's DDL trigger needs are largely satisfied by the
  existing event trigger stuff.  It would be helpful to get confirmation
  as to whether that's the case.
 
 I'm not sure that a replication system that intends to do partial
 replication (e.g. being selective about which objects are to be
 replicated) will necessarily want to use the CREATE event triggers to
 capture creates.
 
 Several cases pop up with different answers:
 a) I certainly don't want to replicate temporary tables
 b) I almost certainly don't want to replicate unlogged tables

Those are quite easy to recognize and skip.

 c) For more ordinary tables, I'm not sure I want to extend Slony
 to detect them and add them automatically, because there
 are annoying sub-cases
 
c.1) If I'm working on data conversion, I may create not totally
  temporary tables that are nonetheless not worthy to replicate.
  (I'm working on such right now)

Sure, you might not want to do it automatically all the time - but I
think it's a very useful default mode. Once you can replicate CREATEs
per se, it's easy to add logic (in a couple lines of plpgsql or
whatever) to only do so in a certain schema or similar.

But the main reason all this is interesting isn't so much CREATE
itself, but that it can be done (and Alvaro has mostly done it!) for
ALTER as well. And there it imo becomes really interesting: because you
can quite easily check whether the affected relation is being
replicated, you can just emit the DDL when that's the case. And that
makes DDL in a logically replicated setup *much* easier.

 Long and short: it seems likely that I'd frequently NOT want all new tables
 added to replication, at least not all of them, all the time.

Agreed. That's quite possible with the design here - you get the
creation commands and can decide whether you want to do anything with
them. You're not forced to insert them into your replication queue or
whatever you're using for that.

 What would seem valuable, to me, would be to have a CREATE event
 trigger that lets me know the OID and/or fully qualified name of the new
 object so that perhaps the replication system:
 
 a) Has some kind of rule system to detect if it wants to replicate it,

Sure.

 b) Logs the change so a human might know later that there's new stuff
 that probably ought to be replicated

Sure.

 c) Perhaps a human might put replication into a new suggestive
 mode, a bit akin to Slony's EXECUTE SCRIPT, but where the human
 essentially says, Here, I'm running DDL against this connection for a
 while, and I'd be grateful if Postgres told Slony to capture all the new
 tables and sequences and replicate them.

Sure.

Some of that already is possible with the current event triggers - and
all of it would be possible with the suggested functionality here.

An old version of bdr, employing the functionality presented here, had
the following (simplified) event trigger:

CREATE OR REPLACE FUNCTION bdr.queue_commands()
 RETURNS event_trigger
LANGUAGE plpgsql
AS $function$
DECLARE
r RECORD;
BEGIN
-- don't recursively log ddl commands
IF 

Re: [HACKERS] Add CREATE support to event triggers

2014-11-10 Thread Jim Nasby

On 11/10/14, 4:58 PM, Andres Freund wrote:

But the main reason all this is interesting isn't so much CREATE
itself. But that it can be (and Alvaro has mostly done it!) for ALTER as
well. And there it imo becomes really interesting. Because you can quite
easily check whether the affected relation is being replicated you can
just emit the DDL when that's the case. And that makes DDL in a
logically replicated setup *much* easier.


+1. Adding columns is a PITA, you have to manually ensure you do it on all 
slaves first.

Drop is somewhat worse, because you have to do it on the master first, opposite 
of the (more usual) case of adding a column.

RENAME is a complete disaster.

Handing scripts to your replication system to execute isn't a very good 
alternative either; it assumes that you actually have a script (bad assumption 
with ORMs), and that you have a reasonable way to get that script to wherever 
you run your replication system.

I will also weigh in that there are a LOT of cases that binary replication doesn't cover. 
I find it interesting that prior to creating built-in replication, the community stance 
was we won't build that because there are too many different use cases, but now 
some folks are saying that everyone should just use streaming rep and be done with it. :P
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [HACKERS] Add CREATE support to event triggers

2014-11-08 Thread Andres Freund
On 2014-11-07 21:41:17 -0500, Robert Haas wrote:
 On Fri, Nov 7, 2014 at 10:45 AM, Alvaro Herrera
 alvhe...@2ndquadrant.com wrote:
  Michael Paquier wrote:
 
  Here are more thoughts along those lines looking at the current state of
  the patch 4 that introduces the infrastructure of the whole feature. This
  would make in-memory manipulation of jsonb containers possible without
  relying on a third-party set of APIs like what this patch is doing with
  ObjTree to deparse the DDL parse trees.
 
  Thanks for the thoughts.  I have to say that I have no intention of
  reworking the jsonb code.  If somebody else wants to do the legwork and
  add that API as you suggest, I'm happy to remove all the ObjTree stuff
  from this patch.  I don't expect this to happen too soon, though, so I
  would instead consider committing this patch based on ObjTree.  Later,
  when somebody comes around to reworking jsonb, we can rip ObjTree out.
 
  This infrastructure would allow in-memory manipulation of jsonb containers.
  Containers that can then easily be converted back to strings and used
  for value lookups with key strings.
 
  Honestly, I had hoped that the jsonb code would have already included
  this kind of functionality.  I wasn't too happy when I discovered that I
  needed to keep the ObjTree crap.  But I don't want to do that work
  myself.
 
 If we're going to have infrastructure for this in core, we really
 ought to make the effort to make it general instead of not.
 
 I still think this whole idea is a mess.  It adds what looks to be a
 LOT of code that, at least as I understand it, we have no compelling
 way to regression test

I don't understand why this is particularly difficult to regression
test. It actually is comparatively simple?

 and which will likely receive very limited
 testing from users to support a feature that is not in core,

Just like half of the features you worked on yourself lately? Why is
this an argument?

Being able to replicate DDL is a feature wish that has been around
pretty much since the inception of trigger-based replication
solutions. It's not some current fancy. And the json stuff only got
there because people wanted some way to manipulate the names in the
replicated commands - which this abstraction provides them with.

 may never be, and which I strongly suspect may be too clever by half.
 Once you've committed it, you're going to move onto other things and
 leave it to everyone who comes after to try to maintain it.  I bet
 we'll still be running into bugs and half-implemented features five
 years from now, and maybe in ten.  Ramming through special-purpose
 infrastructure instead of generalizing it is merely icing on the cake,
 but it's still moving in the wrong direction.

You're just as much to blame for not writing a general json abstraction
layer for EXPLAIN. I'd say that's not much blame, but still, there's
really not much difference there.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] Add CREATE support to event triggers

2014-11-08 Thread Michael Paquier
On Sat, Nov 8, 2014 at 12:45 AM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 Michael Paquier wrote:

 Here are more thoughts along those lines looking at the current state of
 the patch 4 that introduces the infrastructure of the whole feature. This
 would make in-memory manipulation of jsonb containers possible without
 relying on a third-party set of APIs like what this patch is doing with
 ObjTree to deparse the DDL parse trees.

 Thanks for the thoughts.  I have to say that I have no intention of
 reworking the jsonb code.  If somebody else wants to do the legwork and
 add that API as you suggest, I'm happy to remove all the ObjTree stuff
 from this patch.  I don't expect this to happen too soon, though, so I
 would instead consider committing this patch based on ObjTree.  Later,
 when somebody comes around to reworking jsonb, we can rip ObjTree out.
The thing freaking me out in this case is: when would that really
happen? Maybe years from now, and perhaps at that point we would
regret not having put in place the infrastructure that we knew we
could have built.

 This infrastructure would allow in-memory manipulation of jsonb containers.
 Containers that can then easily be converted back to strings and used
 for value lookups with key strings.

 Honestly, I had hoped that the jsonb code would have already included
 this kind of functionality.  I wasn't too happy when I discovered that I
 needed to keep the ObjTree crap.  But I don't want to do that work
 myself.
I can't blame you for that.

In any case, if this code goes in as-is (I am against it at this point
but I am just giving my opinion as a reviewer here), I think that at
least the following things could be done with a minimal effort:
- Provide the set of constructor functions for JsonbValue
- Move the jsonb APIs (find_*_in_jsonbcontainer) doing value lookups
using keys from ddl_json.c to jsonb_util.c, rename them to something
more consistent and complete the set for the other object types.
- Add a TODO item in the wiki, and a TODO item in where ObjTree is defined.
Thanks,
-- 
Michael




Re: [HACKERS] Add CREATE support to event triggers

2014-11-08 Thread Robert Haas
On Sat, Nov 8, 2014 at 4:37 AM, Andres Freund and...@2ndquadrant.com wrote:
 I don't understand why this is particularly difficult to regression
 test. It actually is comparatively simple?

If it is, great.  I previously wrote this email:

http://www.postgresql.org/message-id/CA+TgmoZ=vzrijmxlkqi_v0jg4k4leapmwusc6rwxs5mquxu...@mail.gmail.com

Alvaro came up with a way of addressing the second point I raised
there, which I'm quite pleased about, but AFAIK there's been no
progress on the first one.  Maybe I missed something?

 and which will likely receive very limited
 testing from users to support a feature that is not in core,

 Just like half of the features you worked on yourself lately? Why is
 this an argument?

Because the stuff I'm adding doesn't break every time someone adds a
new DDL command, something that we do regularly.

If it were a question of writing this code once and being done with
it, that would be unobjectionable in my view.  But it isn't.
Practically every change to gram.y is going to require a corresponding
change to this stuff.  As far as I can see, nobody except me has
commented on the burden that places on everyone who may wish to add
syntax support for a new construct in the future, which might mean
that I'm worrying about something that isn't worth worrying about, but
what I think is more likely is that nobody's worrying about it right
now because they haven't had to do it yet.

Just to illustrate the point, consider the CREATE TABLE name OF type
syntax that Peter added a few years ago.   That patch
(e7b3349a8ad7afaad565c573fbd65fb46af6abbe) had the following impact on
gram.y:

 src/backend/parser/gram.y |   56 ++---
 1 file changed, 43 insertions(+), 13 deletions(-)

Now let's have a look at what impact it has on the deparsing code.
Patch 6 has deparse_ColumnDef_Typed from lines 134 to 193.  There's 3
or so lines in deparseTableElements that decide whether to call it.
Patch 8 has more handling for this case, lines 439 to 443 and 463 to
490.  So, if this feature had been committed before TABLE OF, it would
have needed about 100 lines of net new code to handle this case -
exclusive of refactoring.  The actual size of the patch would probably
have been modestly larger than that, because some code would need to
be reindented when it got iffed out, and quite possibly some
rearrangement would have been needed.  But even ignoring all that, the
deparse footprint of the patch would have been MORE THAN TWICE the
parser footprint.
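For readers who haven't opened the patches, the deparse output under discussion is roughly a JSON object carrying a format string plus the values to substitute into it, so a consumer can rewrite pieces (e.g. names) before re-expanding the command. Here is a toy Python sketch of that idea; the key names and the %{...} placeholder syntax are simplified illustrations, not the patch's exact schema:

```python
import json
import re

# Hypothetical deparse output for: CREATE TABLE public.mytab OF public.mytype
deparsed = {
    "fmt": "CREATE TABLE %{identity}s OF %{of_type}s",
    "identity": "public.mytab",
    "of_type": "public.mytype",
}

def expand(node):
    """Expand a deparsed node back into a SQL string by substituting
    each %{name}s placeholder with the node's corresponding value."""
    return re.sub(r"%\{(\w+)\}s", lambda m: str(node[m.group(1)]), node["fmt"])

# Round-trip through JSON, as a replication consumer would receive it,
# and rename the table before expanding -- the manipulation use case.
received = json.loads(json.dumps(deparsed))
received["identity"] = "replica_schema.mytab"
print(expand(received))  # CREATE TABLE replica_schema.mytab OF public.mytype
```

The point of the indirection is visible even in the toy: the consumer never has to parse SQL to rename the object, it just edits one JSON field.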

I think that's going to be typical; and I think the deparse code is
going to be significantly more labor-intensive to write than bison
productions are.  Do you really think that's not going to be a burden?

 Being able to replicate DDL is a feature wish that has been around
 pretty much since the inception of trigger based replication
 solutions. It's not some passing fancy. And the json stuff only got there
 because people wanted some way to manipulate the names in the replicated
 DDL - which this abstraction provides them with.

I understand that being able to replicate DDL is an important need,
and there may be no better way to do it than this.  But that doesn't
mean that this code is going to be bug-free or easy to maintain.

 may never be, and which I strongly suspect may be too clever by half.
 Once you've committed it, you're going to move onto other things and
 leave it to everyone who comes after to try to maintain it.  I bet
 we'll still be running into bugs and half-implemented features five
 years from now, and maybe in ten.  Ramming through special-purpose
 infrastructure instead of generalizing it is merely icing on the cake,
 but it's still moving in the wrong direction.

 You're just as much to blame for not writing a general json abstraction
 layer for EXPLAIN. I'd say that's not much blame, but still, there's
 really not much difference there.

Well, this patch series is at least an order of magnitude larger,
and it's apparently doing significantly more complex stuff with JSON,
because the explain.c patch just writes it out into a StringInfo.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add CREATE support to event triggers

2014-11-08 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 If it were a question of writing this code once and being done with
 it, that would be unobjectionable in my view.  But it isn't.
 Practically every change to gram.y is going to require a corresponding
 change to this stuff.  As far as I can see, nobody except me has
 commented on the burden that places on everyone who may wish to add
 syntax support for a new construct in the future, which might mean
 that I'm worrying about something that isn't worth worrying about, but
 what I think is more likely is that nobody's worrying about it right
 now because they haven't had to do it yet.

I haven't been paying much attention to this thread, but I concur with
Robert that adding yet another set of overhead requirements to any
addition of new SQL is not a good thing.

 Just to illustrate the point, consider the CREATE TABLE name OF type
 syntax that Peter added a few years ago.   That patch
 (e7b3349a8ad7afaad565c573fbd65fb46af6abbe) had the following impact on
 gram.y:

This analysis is kind of cheating, because adding new syntax hasn't been
only a matter of touching gram.y for a very long time.  You invariably
have to touch pg_dump, and you have to touch ruleutils.c unless it's
strictly a DDL-command change.  But people are used to those, and the
value of keeping pg_dump working is clear to everybody.  Adding a similar
level of burden to support a feature with a narrow use-case seems like
a nonstarter from here.  ESPECIALLY if we also have to manually add
regression test cases because there's no easy way to test it directly.

regards, tom lane




Re: [HACKERS] Add CREATE support to event triggers

2014-11-08 Thread Andres Freund
On 2014-11-08 11:52:43 -0500, Tom Lane wrote:
 Adding a similar
 level of burden to support a feature with a narrow use-case seems like
 a nonstarter from here.

I don't understand this statement. In my experience the lack of a usable
replication solution that allows temporary tables and major version
differences is one of the most, if not *the* most, frequent criticisms
of postgres I hear. How is this a narrow use case?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Add CREATE support to event triggers

2014-11-08 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2014-11-08 11:52:43 -0500, Tom Lane wrote:
 Adding a similar
 level of burden to support a feature with a narrow use-case seems like
 a nonstarter from here.

 I don't understand this statement. In my experience the lack of a usable
 replication solution that allows temporary tables and major version
 differences is one of the most, if not *the* most, frequent criticisms
 of postgres I hear. How is this a narrow use case?

[ shrug... ]  I don't personally give a damn about logical replication,
especially not logical replication implemented in this fashion.  It looks
large and rickety (ie full of opportunities for bugs) and I would never
trust data I cared about to it.

Or in short: AFAICS you're not building the next WAL-shipping replication
solution, you're building the next Slony, and Slony never has and never
will be more than a niche use-case.  Putting half of it into core wouldn't
fix that, it would just put a lot more maintenance burden on core
developers.  Core developers are entitled to push back on such proposals.

regards, tom lane




Re: [HACKERS] Add CREATE support to event triggers

2014-11-08 Thread Andres Freund
On 2014-11-08 12:07:41 -0500, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  On 2014-11-08 11:52:43 -0500, Tom Lane wrote:
  Adding a similar
  level of burden to support a feature with a narrow use-case seems like
  a nonstarter from here.
 
  I don't understand this statement. In my experience the lack of a usable
  replication solution that allows temporary tables and major version
  differences is one of the most, if not *the* most, frequent criticisms
  of postgres I hear. How is this a narrow use case?
 
 [ shrug... ]  I don't personally give a damn about logical replication,
 especially not logical replication implemented in this fashion.

"In this fashion" meaning DDL replication via event triggers? If you
have an actual suggestion how to do it better I'm all ears. So far
nobody has come up with anything.

 Or in short: AFAICS you're not building the next WAL-shipping replication
 solution, you're building the next Slony, and Slony never has and never
 will be more than a niche use-case.

A good number of the sites out there use either londiste or slony. Not
because they like it, but because there's no other alternative.

I'd love to simply say that we can make WAL based replication work
across versions, platforms and subsets of relations in PG
clusters. Since that seems quite unrealistic, people have to go different
ways.

 Putting half of it into core wouldn't fix that, it would just put a
 lot more maintenance burden on core developers.

Imo stuff that can't be done sanely outside core needs to be put into
core if it's actually desired by many users. And working DDL replication
for logical replication solutions surely is.

 Core developers are entitled to push back on such proposals.

I'm not saying core developers (whoever that is) aren't allowed to do
so. But just because they think something is (too) invasive doesn't make
it a niche application.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Add CREATE support to event triggers

2014-11-08 Thread Robert Haas
On Sat, Nov 8, 2014 at 12:20 PM, Andres Freund and...@2ndquadrant.com wrote:
 Putting half of it into core wouldn't fix that, it would just put a
 lot more maintenance burden on core developers.

 Imo stuff that can't be done sanely outside core needs to be put into
 core if it's actually desired by many users. And working DDL replication
 for logical replication solutions surely is.

I don't buy it.  This patch series is *all about* transferring the
maintenance burden of this feature from the BDR developers to the core
project.  There's nothing to keep you from exposing the parse trees to
C functions that can live in an extension, and you can do all of this
deparsing there.  Nobody will stop you, and when it breaks (not if)
you can fix it in your code.  The overhead of deparsing new types of
parse nodes can be borne by you.  The only benefit of pushing it into
core is that some other logical replication solution could also take
advantage of that, but I know of nobody with any plans to do such a
thing.  On the flip side, the *cost* of pushing it into core is that
it now becomes the PostgreSQL community's problem to update the
deparsing code every time the grammar changes.  That cost-benefit
trade-off does not look favorable to me, especially given the absence
of any kind of robust testing framework.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Add CREATE support to event triggers

2014-11-08 Thread Andres Freund
On 2014-11-08 12:30:29 -0500, Robert Haas wrote:
 On Sat, Nov 8, 2014 at 12:20 PM, Andres Freund and...@2ndquadrant.com wrote:
  Putting half of it into core wouldn't fix that, it would just put a
  lot more maintenance burden on core developers.
 
  Imo stuff that can't be done sanely outside core needs to be put into
  core if it's actually desired by many users. And working DDL replication
  for logical replication solutions surely is.
 
 I don't buy it.  This patch series is *all about* transferring the
 maintenance burden of this feature from the BDR developers to the core
 project.

What? Not at all. It's *much* less work to do these kind of things out
of core. As you probably have experienced more than once. If it were
possible to do this entirely in an extension I'm pretty sure nobody would
have bothered.

 There's nothing to keep you from exposing the parse trees to
 C functions that can live in an extension, and you can do all of this
 deparsing there.

Not really. There's some core functions that need to be touched. Like
most of the stuff in patches 1,2,3,5,16 does.

We could just integrate those parts, and be done with it. But would that
actually be a good thing for the community? Then slony needs to do it
and potentially others as well? Then auditing can't use it? Then
potential schema tracking solutions can't use it?

 Nobody will stop you, and when it breaks (not if)
 you can fix it in your code. The overhead of deparsing new types of
 parse nodes can be borne by you.  The only benefit of pushing it into
 core is that some other logical replication solution could also take
 advantage of that, but I know of nobody with any plans to do such a
 thing.

There've been people for a long while asking about triggers on catalogs
for that purpose. IIRC Jan was one of them.

 On the flip side, the *cost* of pushing it into core is that
 it now becomes the PostgreSQL's community problem to update the
 deparsing code every time the grammar changes.  That cost-benefit
 trade-off does not look favorable to me, especially given the absence
 of any kind of robust testing framework.

I agree that there should be testing in this.

But I think you're quite overstating the effort of maintaining
this. Looking at gram.y between the stamping of 9.3 devel and 9.4 devel
for commits that'd require DDL deparsing changes:
* ALTER TABLE .. ALTER CONSTRAINT: About 7 lines for the deparse code,
  out of a ~350 line patch
* REFRESH MATERIALIZED VIEW ... CONCURRENTLY: About 1 line for deparse,
  out of ~700
* WITH CHECK OPTION for views: About 3 lines for deparse, out of ~1300
* REPLICA IDENTITY: About 24 lines for deparse, out of ~950
* ALTER TABLESPACE MOVE: About 20 lines for deparse, out of
  ~340. Although that patch was essentially scrapped afterwards. And
  rewritten differently.
* CREATE TABLESPACE ... WITH ...:  Not supported by event triggers right
  now as it's a global object.

That's really not a whole lot.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] Add CREATE support to event triggers

2014-11-08 Thread Andres Freund
On 2014-11-08 10:42:15 -0500, Robert Haas wrote:
 On Sat, Nov 8, 2014 at 4:37 AM, Andres Freund and...@2ndquadrant.com wrote:
  I don't understand why this is particularly difficult to regression
  test. It actually is comparatively simple?
 
 If it is, great.  I previously wrote this email:
 
 http://www.postgresql.org/message-id/CA+TgmoZ=vzrijmxlkqi_v0jg4k4leapmwusc6rwxs5mquxu...@mail.gmail.com
 
 Alvaro came up with a way of addressing the second point I raised
 there, which I'm quite pleased about, but AFAIK there's been no
 progress on the first one.  Maybe I missed something?

I unfortunately don't think so. And that sounds like a completely
reasonable criticism.

 Just to illustrate the point, consider the CREATE TABLE name OF type
 syntax that Peter added a few years ago.   That patch
 (e7b3349a8ad7afaad565c573fbd65fb46af6abbe) had the following impact on
 gram.y:
 
  src/backend/parser/gram.y |   56 ++---
  1 file changed, 43 insertions(+), 13 deletions(-)

 Now let's have a look at what impact it has on the deparsing code.
 Patch 6 has deparse_ColumnDef_Typed from lines 134 to 193.  There's 3
 or so lines in deparseTableElements that decide whether to call it.
 Patch 8 has more handling for this case, lines 439 to 443 and 463 to
 490.  So, if this feature had been committed before TABLE OF, it would
 have needed about 100 lines of net new code to handle this case -
 exclusive of refactoring.  The actual size of the patch would probably
 have been modestly larger than that, because some code would need to
 be reindented when it got iffed out, and quite possibly some
 rearrangement would have been needed.  But even ignoring all that, the
 deparse footprint of the patch would have been MORE THAN TWICE the
 parser footprint.

Well, you disregarded the related costs of adjusting pg_dump et al; as
Tom mentioned, that's a significant part. And yes, there are some
additions that aren't entirely trivial to add. But the majority of
additions are pretty simple.

 I think that's going to be typical; and I think the deparse code is
 going to be significantly more labor-intensive to write than bison
 productions are.  Do you really think that's not going to be a burden?

I've looked into a fair number of cases and almost all are vastly
simpler than this. Most of the DDL changes that have been done lately
are things like adding IF NOT EXISTS somewhere; expanding existing
syntax for new types of objects (ALTER ... RENAME for foreign servers,
wrappers; LABEL for new types).

I think given the complexity of newly added features the overhead of
adding deparsing code isn't all that high.

  Being able to replicate DDL is a feature wish that has been around
  pretty much since the inception of trigger based replication
  solutions. It's not some passing fancy. And the json stuff only got there
  because people wanted some way to manipulate the names in the replicated
  DDL - which this abstraction provides them with.
 
 I understand that being able to replicate DDL is an important need,
 and there may be no better way to do it than this.  But that doesn't
 mean that this code is going to be bug-free or easy to maintain.

Agreed. There's definitely no free lunch (here). This has been discussed
more than once, and so far I haven't read anything superior that also has
a chance of handling ALTER.

  may never be, and which I strongly suspect may be too clever by half.
  Once you've committed it, you're going to move onto other things and
  leave it to everyone who comes after to try to maintain it.  I bet
  we'll still be running into bugs and half-implemented features five
  years from now, and maybe in ten.  Ramming through special-purpose
  infrastructure instead of generalizing it is merely icing on the cake,
  but it's still moving in the wrong direction.
 
  You're just as much to blame for not writing a general json abstraction
  layer for EXPLAIN. I'd say that's not much blame, but still, there's
  really not much difference there.
 
 Well, this patch series is at least an order of magnitude larger,
 and it's apparently doing significantly more complex stuff with JSON,
 because the explain.c patch just writes it out into a StringInfo.

And this code builds up the data in memory and then calls into the json
code to build the json value. And to parse json it uses the functions
underlying the SQL accessors.

There's one function it should either not need anymore (dequote_jsonval)
by using json_object_field_text instead of json_object_field or by
exposing dequote_jsonval's functionality from json.c.

I think the json angle here is a red herring. Sure, a nicer API could
save some FunctionCallInfoData boilerplate, but that's pretty much it.
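The dequote_jsonval point is easy to see outside the backend: extracting a JSON field *as JSON* keeps the surrounding double quotes, while extracting it *as text* strips them, which is what code assembling SQL actually wants. A small Python analogue, only loosely mirroring the SQL json_object_field / json_object_field_text functions:

```python
import json

doc = json.loads('{"schema": "public", "name": "mytab"}')

# json_object_field analogue: re-serialize the value as JSON text,
# so a string value comes back still wrapped in double quotes...
as_json = json.dumps(doc["name"])   # '"mytab"'

# ...whereas the json_object_field_text analogue yields the bare
# string, ready to be spliced into a command without dequoting.
as_text = doc["name"]               # 'mytab'

print(as_json, as_text)
```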

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Add CREATE support to event triggers

2014-11-08 Thread Robert Haas
On Sat, Nov 8, 2014 at 1:05 PM, Andres Freund and...@2ndquadrant.com wrote:
 There's nothing to keep you from exposing the parse trees to
 C functions that can live in an extension, and you can do all of this
 deparsing there.

 Not really. There's some core functions that need to be touched. Like
 most of the stuff in patches 1,2,3,5,16 does.

Patch 1 is fine.  We've done similar things in the past (cf.
c504513f83a9ee8dce4a719746ca73102cae9f13,
82b1b213cad3a69cf5f3dfaa81687c14366960fc).  I'd just commit that.

Patch 2 adds support for GRANT and REVOKE to the event trigger
mechanism.  I wonder if it's a bad idea to make the
ddl_command_start/end events fire for DCL.  We discussed a lot of
these issues when this patch originally went in, and I think it'd be
worth revisiting that discussion.

Patch 3 is the same kind of idea as patch 2, only for COMMENT.

Patch 5 depends on patch 4, which does a bunch of things.  I *think*
the upshot of patch 5 is that we're not currently firing event
triggers in some situations where we should, in which case +1 for
fixing that.  It would help if there were a real commit message,
and/or some more comments, and I think it could be more completely
disentangled from patch 4.

Patch 16 again contains almost no comments and no description of its
specific purpose, but it appears to be similar to patch 1, so probably
mostly uncontroversial.

 We could just integrate those parts, and be done with it. But would that
 actually be a good thing for the community? Then slony needs to do it
 and potentially others as well? Then auditing can't use it? Then
 potential schema tracking solutions can't use it?

Do you think Slony is really going to use this?  I guess we can let
the Slony guys speak for themselves, but I've been skeptical since day
one that this is the best way to do DDL replication, and I still am.
There are lots of ways that a replicated DDL statement can fail on the
replicas, and what are you going to do then?  It's too late to show
the user the error message, so you can throw it in a log someplace and
hope that somebody notices, but that's it.  It makes a lot more sense
to me to use some kind of a tool that applies the DDL in a coordinated
fashion on all nodes - or even just do it manually, since it might
very well be desirable to take the lock on different nodes at widely
different times, separated by a switchover.  I certainly think there's
a use-case for what you're trying to do here, but I don't think it'll
be right for everyone.
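The failure mode described here is easy to sketch: by the time a replicated DDL statement is applied, the origin transaction has long since committed, so an asynchronous apply loop has nowhere to send an error but a log. A toy Python illustration (the replica "executor" below is a stand-in, not any real replication code):

```python
import logging

logging.basicConfig(level=logging.ERROR)
log = logging.getLogger("replica-apply")

def apply_ddl(statement, execute):
    """Apply one replicated DDL statement on a replica.

    On failure the only option left is to log and skip -- the user
    who originally ran the DDL never sees the error message.
    """
    try:
        execute(statement)
        return True
    except Exception as exc:
        log.error("failed to apply %r: %s", statement, exc)
        return False

# A fake executor whose replica is missing the type the DDL needs.
def fake_execute(stmt):
    if "OF mytype" in stmt:
        raise RuntimeError('type "mytype" does not exist on replica')

ok = apply_ddl("CREATE TABLE mytab OF mytype", fake_execute)
print(ok)  # False: the failure is only visible in the replica's log
```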

Certainly, if the Slony guys - or some other team building an
out-of-core replication solutions says, hey, we really want this in
core, that would considerably strengthen the argument for putting it
there.  But I haven't heard anyone say that yet - unlike logical
decoding, where we did have other people expressing clear interest in
using it.

 There've been people for a long while asking about triggers on catalogs
 for that purpose. IIRC Jan was one of them.

My impression, based on something Christopher Brown said a few years
ago, is that Slony's DDL trigger needs are largely satisfied by the
existing event trigger stuff.  It would be helpful to get confirmation
as to whether that's the case.

 On the flip side, the *cost* of pushing it into core is that
 it now becomes the PostgreSQL's community problem to update the
 deparsing code every time the grammar changes.  That cost-benefit
 trade-off does not look favorable to me, especially given the absence
 of any kind of robust testing framework.

 I agree that there should be testing in this.

 But I think you're quite overstating the effort of maintaining
 this. Looking at gram.y between the stamping of 9.3 devel and 9.4 devel
 for commits that'd require DDL deparsing changes:
 * ALTER TABLE .. ALTER CONSTRAINT: About 7 lines for the deparse code,
   out of a ~350 line patch
 * REFRESH MATERIALIZED VIEW ... CONCURRENTLY: About 1 line for deparse,
   out of ~700
 * WITH CHECK OPTION for views: About 3 lines for deparse, out of ~1300
 * REPLICA IDENTITY: About 24 lines for deparse, out of ~950
 * ALTER TABLESPACE MOVE: About 20 lines for deparse, out of
   ~340. Although that patch was essentially scrapped afterwards. And
   rewritten differently.
 * CREATE TABLESPACE ... WITH ...:  Not supported by event triggers right
   now as it's a global object.

 That's really not a whole lot.

Those are pretty minor syntax patches, though.  I think it's more
helpful to look at things like the row-level security stuff, or
materialized views per se, or DDL support for collations.  In any
case, the additional coding burden concerns me less than the
additional testing burden - but there's another email further down
that's more to that specific point, so I'll stop here.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Add CREATE support to event triggers

2014-11-08 Thread Andres Freund
On 2014-11-08 17:49:30 -0500, Robert Haas wrote:
 Patch 2 adds support for GRANT and REVOKE to the event trigger
 mechanism.  I wonder if it's a bad idea to make the
 ddl_command_start/end events fire for DCL.  We discussed a lot of
 these issues when this patch originally went in, and I think it'd be
 worth revisiting that discussion.

Well, it doesn't generally support it for all GRANT statements, but just
for ones that are database-local. I think that mostly skirts the
problems from the last round of discussion. But I only vaguely remember
them.

 Patch 5 depends on patch 4, which does a bunch of things.  I *think*
 the upshot of patch 5 is that we're not currently firing event
 triggers in some situations where we should, in which case +1 for
 fixing that.  It would help if there were a real commit message,
 and/or some more contents, and I think it could be more completely
 disentangled from patch 4.

  We could just integrate those parts, and be done with it. But would that
  actually be a good thing for the community? Then slony needs to do it
  and potentially others as well? Then auditing can't use it? Then
  potential schema tracking solutions can't use it?
 
 Do you think Slony is really going to use this?

There was a fair amount of noise about it at one of the past cluster
hackers thingies.

 I guess we can let
 the Slony guys speak for themselves, but I've been skeptical since day
 one that this is the best way to do DDL replication, and I still am.

Well, I've yet to hear anything that's realistic otherwise.

 There are lots of ways that a replicated DDL statement can fail on the
 replicas, and what are you going to do then?  It's too late to show
 the user the error message, so you can throw it in a log someplace and
 hope that somebody notices, but that's it.

Sure. And absolutely the same is true for DML. And the lack of DDL
integration makes it happen really rather frequently...

 It makes a lot more sense
 to me to use some kind of a tool that applies the DDL in a coordinated
 fashion on all nodes - or even just do it manually, since it might
 very well be desirable to take the lock on different nodes at widely
 different times, separated by a switchover.  I certainly think there's
 a use-case for what you're trying to do here, but I don't think it'll
 be right for everyone.

I agree that it's not right for everyone.

 Certainly, if the Slony guys - or some other team building an
 out-of-core replication solutions says, hey, we really want this in
 core, that would considerably strengthen the argument for putting it
 there.  But I haven't heard anyone say that yet - unlike logical
 decoding, where we did have other people expressing clear interest in
 using it.

As I said, there was clear interest at at least two of the cluster
hackers meetings...

  There've been people for a long while asking about triggers on catalogs
  for that purpose. IIRC Jan was one of them.
 
 My impression, based on something Christopher Brown said a few years
 ago, is that Slony's DDL trigger needs are largely satisfied by the
 existing event trigger stuff.  It would be helpful to get confirmation
 as to whether that's the case.

Oh, that's contrary to what I remember, but yes, it'd be interesting to
hear about that.

  On the flip side, the *cost* of pushing it into core is that
  it now becomes the PostgreSQL community's problem to update the
  deparsing code every time the grammar changes.  That cost-benefit
  trade-off does not look favorable to me, especially given the absence
  of any kind of robust testing framework.
 
  I agree that there should be testing in this.
 
  But I think you're quite overstating the effort of maintaining
  this. Looking at gram.y between the stamping of 9.3 devel and 9.4 devel
  for commits that'd require DDL deparsing changes:
  * ALTER TABLE .. ALTER CONSTRAINT: About 7 lines for the deparse code,
out of a ~350 line patch
  * REFRESH MATERIALIZED VIEW ... CONCURRENTLY: About 1 line for deparse,
out of ~700
  * WITH CHECK OPTION for views: About 3 lines for deparse, out of ~1300
  * REPLICA IDENTITY: About 24 lines for deparse, out of ~950
  * ALTER TABLESPACE MOVE: About 20 lines for deparse, out of
~340. Although that patch was essentially scrapped afterwards. And
rewritten differently.
  * CREATE TABLESPACE ... WITH ...:  Not supported by event triggers right
now as it's a global object.
 
  That's really not a whole lot.
 
 Those are pretty minor syntax patches, though.

Sure, but these are all the ones from the stamping of 9.3devel to
9.4devel. I wanted to look at a release cycle, and that seemed the
easiest way to do it. I didn't choose that cycle because I knew it was
light on DDL; I chose it because it was the last complete one.

 I think it's more helpful to look at things like the row-level
 security stuff, or materialized views per se, or DDL support for
 collations.

I can't imagine that writing the relatively straightforward stuff that
deparsing 

Re: [HACKERS] Add CREATE support to event triggers

2014-11-08 Thread Robert Haas
On Sat, Nov 8, 2014 at 4:35 PM, Andres Freund and...@2ndquadrant.com wrote:
  I don't understand why this is particularly difficult to regression
  test. It actually is comparatively simple?

 If it is, great.  I previously wrote this email:

 http://www.postgresql.org/message-id/CA+TgmoZ=vzrijmxlkqi_v0jg4k4leapmwusc6rwxs5mquxu...@mail.gmail.com

 Alvaro came up with a way of addressing the second point I raised
 there, which I'm quite pleased about, but AFAIK there's been no
 progress on the first one.  Maybe I missed something?

 I unfortunately don't think so. And that sounds like a completely
 reasonable criticism.

I'm glad you agree.  If you can find a way to address that point, I
can live with the rest of it.  I don't think it's dumb to be concerned
about features that increase the cost of adding more features.  But
what really concerns me is that that code won't be well-tested, and if
there are cases missing or somebody forgets to do it altogether, it's
very likely that we won't notice.  That seems like a huge problem from
where I sit.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Add CREATE support to event triggers

2014-11-08 Thread Robert Haas
On Sat, Nov 8, 2014 at 6:24 PM, Andres Freund and...@2ndquadrant.com wrote:
 On 2014-11-08 17:49:30 -0500, Robert Haas wrote:
 Patch 2 adds support for GRANT and REVOKE to the event trigger
 mechanism.  I wonder if it's a bad idea to make the
 ddl_command_start/end events fire for DCL.  We discussed a lot of
 these issues when this patch originally went in, and I think it'd be
 worth revisiting that discussion.

 Well, it doesn't generally support it for all GRANT statements, but just
 for ones that are database-local. I think that mostly skirts the
 problems from the last round of discussion. But I only vaguely remember
 them.

The issue I was alluding to was terminological: it's not clear that
GRANT and REVOKE should be called DDL rather than DCL, although we do
have precedent in some of the logging settings.

The other issue I remember is that if you have a separate event
trigger for GRANT/REVOKE, you can expose fields like operation/object
type/permission/target.  That's much harder to do for an event trigger
that's very broad.

But anyway, I think it would be worth going back and looking at the
previous discussion.

 As I said, there was clear interest at at least two of the cluster
 hackers meetings...

We need to get some better data here.

But again, my core concern is that we have no good way to test this
code for bugs, including of omission, without which I think we will be
very sad.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Add CREATE support to event triggers

2014-11-07 Thread Alvaro Herrera
Michael Paquier wrote:

 Here are more thoughts along those lines, looking at the current state of
 the patch 4 that introduces the infrastructure of the whole feature. This
 would make possible in-memory manipulation of jsonb containers without
 relying on a third-party set of APIs like what this patch is doing with
 ObjTree to deparse the DDL parse trees.

Thanks for the thoughts.  I have to say that I have no intention of
reworking the jsonb code.  If somebody else wants to do the legwork and
add that API as you suggest, I'm happy to remove all the ObjTree stuff
from this patch.  I don't expect this to happen too soon, though, so I
would instead consider committing this patch based on ObjTree.  Later,
when somebody comes around to reworking jsonb, we can rip ObjTree out.

 This infrastructure would allow in-memory manipulation of jsonb containers.
 Containers could then easily be changed back to strings and used for value
 lookups using key strings.

Honestly, I had hoped that the jsonb code would have already included
this kind of functionality.  I wasn't too happy when I discovered that I
needed to keep the ObjTree crap.  But I don't want to do that work
myself.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Add CREATE support to event triggers

2014-11-07 Thread Robert Haas
On Fri, Nov 7, 2014 at 10:45 AM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 Michael Paquier wrote:

 Here are more thoughts along those lines, looking at the current state of
 patch 4, which introduces the infrastructure of the whole feature. This
 would make possible in-memory manipulation of jsonb containers without
 relying on a third-party set of APIs like what this patch is doing with
 ObjTree to deparse the DDL parse trees.

 Thanks for the thoughts.  I have to say that I have no intention of
 reworking the jsonb code.  If somebody else wants to do the legwork and
 add that API as you suggest, I'm happy to remove all the ObjTree stuff
 from this patch.  I don't expect this to happen too soon, though, so I
 would instead consider committing this patch based on ObjTree.  Later,
 when somebody comes around to reworking jsonb, we can rip ObjTree out.

 This infrastructure would allow in-memory manipulation of jsonb containers.
 Containers could then easily be changed back to strings and used for value
 lookups using key strings.

 Honestly, I had hoped that the jsonb code would have already included
 this kind of functionality.  I wasn't too happy when I discovered that I
 needed to keep the ObjTree crap.  But I don't want to do that work
 myself.

If we're going to have infrastructure for this in core, we really
ought to make the effort to make it general instead of not.

I still think this whole idea is a mess.  It adds what looks to be a
LOT of code that, at least as I understand it, we have no compelling
way to regression test and which will likely receive very limited
testing from users to support a feature that is not in core, may never
be, and which I strongly suspect may be too clever by half.  Once
you've committed it, you're going to move onto other things and leave
it to everyone who comes after to try to maintain it.  I bet we'll
still be running into bugs and half-implemented features five years
from now, and maybe in ten.  Ramming through special-purpose
infrastructure instead of generalizing it is merely icing on the cake,
but it's still moving in the wrong direction.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Add CREATE support to event triggers

2014-11-04 Thread Michael Paquier
On Fri, Oct 31, 2014 at 11:27 AM, Michael Paquier michael.paqu...@gmail.com
 wrote:

 So, what I think is missing is really a friendly interface to manipulate
 JsonbContainers directly, and I think that we are not far from it with
 something like this set, roughly:
 - Initialization of an empty container
 - Set of APIs to directly push a value to a container (boolean, array,
 null, string, numeric or other jsonb object)
 - Initialization of JsonbValue objects

Here are more thoughts along those lines, looking at the current state of
patch 4, which introduces the infrastructure of the whole feature. This
would make possible in-memory manipulation of jsonb containers without
relying on a third-party set of APIs like what this patch is doing with
ObjTree to deparse the DDL parse trees.
1) Set of constructor functions for JsonbValue: null, bool, string, array,
JSONB object for nested values. Note that keys can be represented as Jsonb
string objects.
2) Lookup functions for values in a JsonbContainer. Patch 4 is introducing
that with find_string_in_jsonbcontainer and find_bool_in_jsonbcontainer. We
may as well extend it to be able to look for another Jsonb object for
nested searches for example.
3) Functions to push JsonbValue within a container, using a key and a
value. This is where most of the work would be necessary, for bool, null,
string, Jsonb object and numeric.

This infrastructure would allow in-memory manipulation of jsonb containers.
Containers could then easily be changed back to strings and used for value
lookups using key strings.
Regards,
--
Michael


Re: [HACKERS] Add CREATE support to event triggers

2014-10-30 Thread Michael Paquier
On Thu, Oct 30, 2014 at 2:40 AM, Robert Haas robertmh...@gmail.com wrote:

 On Tue, Oct 28, 2014 at 6:00 AM, Andres Freund and...@2ndquadrant.com
 wrote:
  Uhm.  Obviously we didn't have jsonb when I started this and we do have
  them now, so I could perhaps see about updating the patch to do things
  this way; but I'm not totally sold on that idea, as my ObjTree stuff is
  a lot easier to manage and the jsonb API is pretty ugly.
 
  I looked at this as well, and I think trying to do so would not result
  in readable code.

 That doesn't speak very well of jsonb.  :-(


Just did the same, and I played a bit with the APIs. I am getting the
impression that the jsonb API is currently focused on parsing and deparsing
jsonb strings to/from containers, but there is no real interface that
allows easy manipulation of the containers where the values are
located. So, what I think is missing is really a friendly interface to
manipulate JsonbContainers directly, and I think that we are not far from
it with something like this set, roughly:
- Initialization of an empty container
- Set of APIs to directly push a value to a container (boolean, array,
null, string, numeric or other jsonb object)
- Initialization of JsonbValue objects
With this basic set of APIs, patch 4 could for example use JsonbToCString to
convert the JSONB object back to the string it sends to the client. Note as
well that findJsonbValueFromContainer is already present to get back
a value from a container.

In short, my point is: instead of reinventing the wheel like what this
series of patches is doing with ObjTree, I think that it would be more
fruitful to have a more solid in-place JSONB infrastructure that allows
direct manipulation of JSONB objects. This feature as well as future
extensions could benefit from that.
Feel free to comment.
Regards,
-- 
Michael


Re: [HACKERS] Add CREATE support to event triggers

2014-10-28 Thread Michael Paquier
On Thu, Oct 16, 2014 at 4:01 PM, Michael Paquier michael.paqu...@gmail.com
wrote:

 On Mon, Oct 13, 2014 at 12:45 PM, Alvaro Herrera
 alvhe...@2ndquadrant.com wrote:
  Here's a new version of this series.

 Here is some input on patch 4:

1) Use of OBJECT_ATTRIBUTE:
+   case OBJECT_ATTRIBUTE:
+   catalog_id = TypeRelationId;/* XXX? */
+   break;
I think that the XXX mark could be removed; using TypeRelationId is correct
IMO, as OBJECT_ATTRIBUTE is used when managing an attribute object, which is
a custom type used for CREATE/ALTER TYPE.
2) This patch produces many warnings, among them:
event_trigger.c:1460:20: note: uninitialized use occurs here
addr.classId = classId;
   ^~~
event_trigger.c:1446:21: note: initialize the variable 'objSubId' to
silence this warning
uint32  objSubId;
^
 = 0
Or:
deparse_utility.c:301:1: warning: unused function 'append_object_object'
[-Wunused-function]
append_object_object(ObjTree *tree, char *name, ObjTree *value)
^
In the 2nd case though I imagine that those functions in deparse_utility.c
are used afterwards... There are a couple of other warnings related to
SCT_Simple, but that's normal as long as patches 17 or 24 are not combined
with it.
3) What's that actually?
+/* XXX merge this with ObjectTypeMap? */
 static event_trigger_support_data event_trigger_support[] = {
We may be able to do something smarter with event_trigger_support[], but as
this consists of a mapping of subcommands associated with CREATE/DROP/ALTER
and is only a reverse operation of something like AlterObjectTypeCommandTag
or CreateCommandTag, I am not sure how you could merge that. Some input
perhaps?
4)
+/*
+ * EventTriggerStashCommand
+ * Save data about a simple DDL command that was just executed
+ */
Shouldn't this be "single" instead of "simple"?
5) I think that SCT_Simple should be renamed to SCT_Single:
+typedef enum StashedCommandType
+{
+   SCT_Simple,
+} StashedCommandType;
This comment holds as well for deparse_simple_command.
6)
+   command = deparse_utility_command(cmd);
+
+   /*
+* Some parse trees return NULL when deparse is attempted;
we don't
+* emit anything for them.
+*/
+   if (command != NULL)
+   {
Small detail, but you may here just use a continue to make the code a bit
more readable after deparsing the command.
7) pg_event_trigger_get_creation_commands is modified as well in patches 17
and 24. You may as well use an enum on cmd->type.
8) Echoing a comment made earlier by Andres, it would be nice to have
ERRCODE_WRONG_CONTEXT (unrelated to this patch);
ERRCODE_FEATURE_NOT_SUPPORTED seems rather a different error type...
9) Those declarations are not needed in event_trigger.c:
+#include "utils/json.h"
+#include "utils/jsonb.h"
10) Would you mind explaining what "fmt" means?
+ * Allocate a new object tree to store parameter values -- varargs version.
+ *
+ * The "fmt" argument is used to append as a "fmt" element in the output blob.
11) deparse_utility.c mentions JSON objects here and there, but what is
created are JSONB objects. I'd rather be clear here.
12) Already mentioned before, but this reverse engineering machine for
types would be more welcome as a set of functions in lsyscache (one for
namespace, one for type name and one for is_array). For typemodstr the need
is different as it uses printTypmod.
+void
+format_type_detailed(Oid type_oid, int32 typemod,
+Oid *nspid, char **typname, char
**typemodstr,
+bool *is_array)

13) This change seems unrelated to this patch...
-   int type = 0;
+   JsonbIteratorToken type = WJB_DONE;
    JsonbValue  v;
    int level = 0;
    bool        redo_switch = false;
@@ -454,7 +454,7 @@ JsonbToCString(StringInfo out, JsonbContainer *in, int estimated_len)
            first = false;
            break;
        default:
-           elog(ERROR, "unknown flag of jsonb iterator");
+           elog(ERROR, "unknown jsonb iterator token type");
14) This could already be pushed as a separate commit:
-extern bool creating_extension;
+extern PGDLLIMPORT bool creating_extension;

A couple of comments: this patch introduces a basic infrastructure able to
do the following set of operations:
- Obtaining the parse tree using StashedCommand
- Reorganization of the parse tree into an ObjTree, with booleans, arrays
- Reorganization of the ObjTree into a JsonB value
I am actually a bit doubtful about why we actually need this intermediate
ObjTree step... 

Re: [HACKERS] Add CREATE support to event triggers

2014-10-28 Thread Alvaro Herrera
Hi Michael, thanks for the review.

Michael Paquier wrote:
 On Thu, Oct 16, 2014 at 4:01 PM, Michael Paquier michael.paqu...@gmail.com
 wrote:
 
  On Mon, Oct 13, 2014 at 12:45 PM, Alvaro Herrera
  alvhe...@2ndquadrant.com wrote:
   Here's a new version of this series.
 
  Here is some input on patch 4:
 
 1) Use of OBJECT_ATTRIBUTE:
 +   case OBJECT_ATTRIBUTE:
 +   catalog_id = TypeRelationId;/* XXX? */
 +   break;
 I think that the XXX mark could be removed; using TypeRelationId is correct
 IMO, as OBJECT_ATTRIBUTE is used when managing an attribute object, which is
 a custom type used for CREATE/ALTER TYPE.

Agreed.

 2) This patch produces many warnings, among them:

Will check.

 3) What's that actually?
 +/* XXX merge this with ObjectTypeMap? */
  static event_trigger_support_data event_trigger_support[] = {
 We may be able to do something smarter with event_trigger_support[], but as
 this consists of a mapping of subcommands associated with CREATE/DROP/ALTER
 and is only a reverse operation of something like AlterObjectTypeCommandTag
 or CreateCommandTag, I am not sure how you could merge that. Some input
 perhaps?

ObjectTypeMap is part of the patch that handles DROP for replication;
see my other patch in the commitfest.  I am also not sure how to merge all
this stuff; with these patches, we would have three "do event triggers
support this object type" functions, so I was thinking of having maybe a
text file from which these functions are generated by a perl script or
something.  But for now I think it's okay to keep things like this.
That comment is only there to remind me that some cleanup might be in
order.

 4)
 +/*
 + * EventTriggerStashCommand
 + * Save data about a simple DDL command that was just executed
 + */
 Shouldn't this be "single" instead of "simple"?

In an older version it was "basic".  Not wedded to "simple", but I don't
think "single" is the right thing.  A later patch in the series
introduces type "Grant", and there's also type "AlterTable".  The idea
behind "Simple" is to include command types that do not require special
handling; but all these commands are single commands.

 6)
 +   command = deparse_utility_command(cmd);
 +
 +   /*
 +* Some parse trees return NULL when deparse is attempted;
 we don't
 +* emit anything for them.
 +*/
 +   if (command != NULL)
 +   {
 Small detail, but you may here just use a continue to make the code a bit
 more readable after deparsing the command.

Will check.

 9) Those declarations are not needed in event_trigger.c:
 +#include "utils/json.h"
 +#include "utils/jsonb.h"

Will check. I split ddl_json.c at the last minute and I may have
forgotten to remove these.

 10) Would you mind explaining what "fmt" means?
 + * Allocate a new object tree to store parameter values -- varargs version.
 + *
 + * The "fmt" argument is used to append as a "fmt" element in the output blob.

"fmt" is equivalent to sprintf and friends' "fmt" argument.  I guess this
comment needs to be expanded a bit.

 11) deparse_utility.c mentions here and there JSON objects, but what is
 created are JSONB objects. I'd rather be clear here.

Good point.

 12) Already mentioned before, but this reverse engineering machine for
 types would be more welcome as a set of functions in lsyscache (one for
 namespace, one for type name and one for is_array). For typemodstr the need
 is different as it uses printTypmod.
 +void
 +format_type_detailed(Oid type_oid, int32 typemod,
 +Oid *nspid, char **typname, char
 **typemodstr,
 +bool *is_array)

I am unsure about this.  Other things that require many properties of
the same object do a single lookup and return all of them in a single
call, rather than repeated calls.

 13) This change seems unrelated to this patch...
 -   int type = 0;
 +   JsonbIteratorToken type = WJB_DONE;
 JsonbValue  v;
 int level = 0;
 boolredo_switch = false;
 @@ -454,7 +454,7 @@ JsonbToCString(StringInfo out, JsonbContainer *in, int
 estimated_len)
 first = false;
 break;
 default:
 -   elog(ERROR, unknown flag of jsonb
 iterator);
 +   elog(ERROR, unknown jsonb iterator token
 type);

Yes, sorry.  I was trying to figure out how to use the jsonb stuff and
I found this error message was quite unclear.  In general, jsonb code
seems to have random warts ...

 A couple of comments: this patch introduces a basic infrastructure able to
 do the following set of operations:
  - Obtaining the parse tree using StashedCommand
  - Reorganization of the parse tree into an ObjTree, with booleans, arrays
  - Reorganization of the ObjTree into a JsonB value

Re: [HACKERS] Add CREATE support to event triggers

2014-10-28 Thread Andres Freund
On 2014-10-28 05:30:43 -0300, Alvaro Herrera wrote:
  A couple of comments: this patch introduces a basic infrastructure able to
  do the following set of operations:
  - Obtaining the parse tree using StashedCommand
  - Reorganization of the parse tree into an ObjTree, with booleans, arrays
  - Reorganization of the ObjTree into a JsonB value
  I am actually a bit doubtful about why we actually need this intermediate
  ObjTree step... What would be wrong with manipulating a JSON(B) parsed tree
  that is first empty, and pushing key/value objects to it when processing
  each command? Something moving in this direction is that ObjTree has
  some logic to manipulate booleans, null values, etc. This results in some
  duplication with what jsonb and json can actually do when creating and
  manipulating strings, as well as the extra processing like
  objtree_to_jsonb_element. ddl_json.c (should it be ddl_jsonb.c?) would also
  make more sense, as it directly manipulates JSONB containers.
 
 Uhm.  Obviously we didn't have jsonb when I started this and we do have
 them now, so I could perhaps see about updating the patch to do things
 this way; but I'm not totally sold on that idea, as my ObjTree stuff is
 a lot easier to manage and the jsonb API is pretty ugly.

I looked at this as well, and I think trying to do so would not result
in readable code.

Greetings,

Andres Freund

-- 
 Andres Freund                 http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Add CREATE support to event triggers

2014-10-16 Thread Michael Paquier
On Mon, Oct 13, 2014 at 12:45 PM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 Here's a new version of this series.  The main change is that I've
 changed deparse_utility.c to generate JSON, and the code that was in
 commands/event_trigger.c to decode that JSON, so that it uses the new
 Jsonb API instead.  In addition, I've moved the new code that was in
 commands/event_trigger.c to utils/adt/ddl_json.c.  (The only entry point
 of the new file is the SQL-callable pg_event_trigger_expand_command()
 function, and its purpose is to expand a JSON object emitted by the
 deparse_utility.c code back into a plain text SQL command.)
 I have also cleaned up the code per comments from Michael Paquier and
 Andres Freund:

 * the GRANT support for event triggers now correctly ignores global
 objects.

 * COMMENT ON .. IS NULL no longer causes a crash
Why would this not fail? Patch 3 in this set is identical to the last
one. I tested that and it worked well...

 * renameatt() and ExecRenameStmt are consistent in returning the
 objSubId as an int (not int32).  This is what is used as objectSubId
 in ObjectAddress, which is what we're using this value for.

In patch 1, ExecRenameStmt returns objsubid for an attribute name when
rename is done on an attribute. Now could you clarify why we skip this
list of objects even if their sub-object ID is available with
address.objectSubId?
case OBJECT_AGGREGATE:
case OBJECT_COLLATION:
case OBJECT_CONVERSION:
case OBJECT_EVENT_TRIGGER:
case OBJECT_FDW:
case OBJECT_FOREIGN_SERVER:
case OBJECT_FUNCTION:
case OBJECT_OPCLASS:
case OBJECT_OPFAMILY:
case OBJECT_LANGUAGE:
case OBJECT_TSCONFIGURATION:
case OBJECT_TSDICTIONARY:
case OBJECT_TSPARSER:
case OBJECT_TSTEMPLATE:

Patch 2 fails on make check for the tests privileges and foreign_data
by showing double the number of warnings for some object types:
*** /Users/ioltas/git/postgres/src/test/regress/expected/privileges.out	Fri Oct 10 14:34:10 2014
--- /Users/ioltas/git/postgres/src/test/regress/results/privileges.out	Thu Oct 16 15:47:42 2014
***
*** 118,123 
--- 118,124 
  ERROR:  permission denied for relation atest2
  GRANT ALL ON atest1 TO PUBLIC; -- fail
  WARNING:  no privileges were granted for atest1
+ WARNING:  no privileges were granted for atest1
  -- checks in subquery, both ok
  SELECT * FROM atest1 WHERE ( b IN ( SELECT col1 FROM atest2 ) );
   a | b
It is fine that EventTriggerSupportsGrantObjectType removes
ACL_OBJECT_DATABASE and ACL_OBJECT_TABLESPACE from the list of
supported objects. That's also in line with the current firing
matrix. I think that it would be appropriate to add a comment on top
of this function.

Patch 3 looks good.
Regards,
-- 
Michael




Re: [HACKERS] Add CREATE support to event triggers

2014-10-11 Thread Michael Paquier
On Thu, Oct 9, 2014 at 8:29 AM, Michael Paquier
michael.paqu...@gmail.com wrote:
 On Thu, Oct 9, 2014 at 6:26 AM, Alvaro Herrera alvhe...@2ndquadrant.com
 wrote:
 However, if I were to do it, I would instead create a quote.h file and
 would also add the quote_literal_cstr() prototype to it, perhaps even
 move the implementations from their current ruleutils.c location to
 quote.c.  (Why half the stuff is in ruleutils.c rather than quote.c is
 beyond me.)

 Yes, an extra header file would be cleaner. Now if we are worried about
 creating too many incompatibilities with existing extensions (IMO that's not
 something core should worry much about), then it is better not to do it.
 Now, if I can vote on it, I think it is worth doing in order to have
 builtins.h contain only Datum foo(PG_FUNCTION_ARGS), and move all the
 quote-related things into quote.c.
The attached patch implements this idea, creating a new header quote.h
in include/utils that groups all the quote-related functions. This
makes the code more organized.
Regards,
-- 
Michael
From b14000662a52c3f5b40cf43a1f6699e0d024d1fd Mon Sep 17 00:00:00 2001
From: Michael Paquier mich...@otacoo.com
Date: Sat, 11 Oct 2014 22:28:05 +0900
Subject: [PATCH] Move all quote-related functions into a single header quote.h

Note that this impacts contrib modules as well, and mainly external
modules, particularly for quote_identifier.
---
 contrib/dblink/dblink.c   |   1 +
 contrib/postgres_fdw/deparse.c|   1 +
 contrib/postgres_fdw/postgres_fdw.c   |   1 +
 contrib/test_decoding/test_decoding.c |   1 +
 contrib/worker_spi/worker_spi.c   |   1 +
 src/backend/catalog/namespace.c   |   1 +
 src/backend/catalog/objectaddress.c   |   1 +
 src/backend/commands/explain.c|   1 +
 src/backend/commands/extension.c  |   1 +
 src/backend/commands/matview.c|   1 +
 src/backend/commands/trigger.c|   1 +
 src/backend/commands/tsearchcmds.c|   1 +
 src/backend/parser/parse_utilcmd.c|   1 +
 src/backend/utils/adt/format_type.c   |   1 +
 src/backend/utils/adt/quote.c | 110 ++
 src/backend/utils/adt/regproc.c   |   1 +
 src/backend/utils/adt/ri_triggers.c   |   1 +
 src/backend/utils/adt/ruleutils.c | 109 +
 src/backend/utils/adt/varlena.c   |   1 +
 src/backend/utils/cache/ts_cache.c|   1 +
 src/backend/utils/misc/guc.c  |   1 +
 src/include/utils/builtins.h  |   6 --
 src/include/utils/quote.h |  28 +
 src/pl/plpgsql/src/pl_gram.y  |   1 +
 src/pl/plpython/plpy_plpymodule.c |   1 +
 25 files changed, 160 insertions(+), 114 deletions(-)
 create mode 100644 src/include/utils/quote.h

diff --git a/contrib/dblink/dblink.c b/contrib/dblink/dblink.c
index d77b3ee..cbbc513 100644
--- a/contrib/dblink/dblink.c
+++ b/contrib/dblink/dblink.c
@@ -56,6 +56,7 @@
 #include "utils/guc.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
+#include "utils/quote.h"
 #include "utils/rel.h"
 #include "utils/tqual.h"
 
diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index 2045774..0009cd3 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -50,6 +50,7 @@
 #include "parser/parsetree.h"
 #include "utils/builtins.h"
 #include "utils/lsyscache.h"
+#include "utils/quote.h"
 #include "utils/syscache.h"
 
 
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 4c49776..feb41f5 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -36,6 +36,7 @@
 #include "utils/guc.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
+#include "utils/quote.h"
 
 PG_MODULE_MAGIC;
 
diff --git a/contrib/test_decoding/test_decoding.c b/contrib/test_decoding/test_decoding.c
index fdbd313..b168fc3 100644
--- a/contrib/test_decoding/test_decoding.c
+++ b/contrib/test_decoding/test_decoding.c
@@ -25,6 +25,7 @@
 #include "utils/builtins.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
+#include "utils/quote.h"
 #include "utils/rel.h"
 #include "utils/relcache.h"
 #include "utils/syscache.h"
diff --git a/contrib/worker_spi/worker_spi.c b/contrib/worker_spi/worker_spi.c
index 328c722..7ada98d 100644
--- a/contrib/worker_spi/worker_spi.c
+++ b/contrib/worker_spi/worker_spi.c
@@ -38,6 +38,7 @@
 #include "lib/stringinfo.h"
 #include "pgstat.h"
 #include "utils/builtins.h"
+#include "utils/quote.h"
 #include "utils/snapmgr.h"
 #include "tcop/utility.h"
 
diff --git a/src/backend/catalog/namespace.c b/src/backend/catalog/namespace.c
index 5eb8fd4..7f8f54b 100644
--- a/src/backend/catalog/namespace.c
+++ b/src/backend/catalog/namespace.c
@@ -53,6 +53,7 @@
 #include "utils/inval.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
+#include "utils/quote.h"
 #include "utils/syscache.h"
 
 
diff --git a/src/backend/catalog/objectaddress.c b/src/backend/catalog/objectaddress.c
index b69b75b..d7a1ec7 100644
--- 

Re: [HACKERS] Add CREATE support to event triggers

2014-10-08 Thread Alvaro Herrera
About patch 1, which I just pushed, there were two reviews:

Andres Freund wrote:
 On 2014-09-25 18:59:31 -0300, Alvaro Herrera wrote:
  diff --git a/src/include/utils/ruleutils.h b/src/include/utils/ruleutils.h
  new file mode 100644
  index 000..520b066
  --- /dev/null
  +++ b/src/include/utils/ruleutils.h

 I wondered for a minute whether any of these are likely to cause
 problems for code just including builtins.h - but I think there will be
 sufficiently few callers for them to make that not much of a concern.

Great.

Michael Paquier wrote:

 Patch 1: I still like this patch as it gives a clear separation of the
 built-in functions and the sub-functions of ruleutils.c that are
 completely independent. Have you considered adding the external
 declaration of quote_all_identifiers as well? It is true that this
 impacts extensions (some of my stuff as well), but my point is to bite
 the bullet and make the separation cleaner between builtins.h and
 ruleutils.h. Attached is a patch that can be applied on top of patch 1
 doing so... Feel free to discard for the potential breakage this would
 create though.

Yes, I did consider moving the quoting function definitions and the
global out of builtins.h.  It is much more disruptive, however, because
it affects a lot of external code not only our own; and given the looks
I got after people had to mess with adding the htup_details.h header to
external projects due to the refactoring I did to htup.h in an earlier
release, I'm not sure about it.

However, if I were to do it, I would instead create a quote.h file and
would also add the quote_literal_cstr() prototype to it, perhaps even
move the implementations from their current ruleutils.c location to
quote.c.  (Why half the stuff is in ruleutils.c rather than quote.c is
beyond me.)

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Add CREATE support to event triggers

2014-10-08 Thread Michael Paquier
On Thu, Oct 9, 2014 at 6:26 AM, Alvaro Herrera alvhe...@2ndquadrant.com
wrote:

 About patch 1, which I just pushed, there were two reviews:

 Andres Freund wrote:
  On 2014-09-25 18:59:31 -0300, Alvaro Herrera wrote:
   diff --git a/src/include/utils/ruleutils.h
 b/src/include/utils/ruleutils.h
   new file mode 100644
   index 000..520b066
   --- /dev/null
   +++ b/src/include/utils/ruleutils.h

  I wondered for a minute whether any of these are likely to cause
  problems for code just including builtins.h - but I think there will be
  sufficiently few callers for them to make that not much of a concern.

 Great.

 Michael Paquier wrote:

  Patch 1: I still like this patch as it gives a clear separation of the
  built-in functions and the sub-functions of ruleutils.c that are
  completely independent. Have you considered adding the external
  declaration of quote_all_identifiers as well? It is true that this
  impacts extensions (some of my stuff as well), but my point is to bite
  the bullet and make the separation cleaner between builtins.h and
  ruleutils.h. Attached is a patch that can be applied on top of patch 1
  doing so... Feel free to discard for the potential breakage this would
  create though.

 Yes, I did consider moving the quoting function definitions and the
 global out of builtins.h.  It is much more disruptive, however, because
 it affects a lot of external code not only our own; and given the looks
 I got after people had to mess with adding the htup_details.h header to
 external projects due to the refactoring I did to htup.h in an earlier
 release, I'm not sure about it.

 However, if I were to do it, I would instead create a quote.h file and
 would also add the quote_literal_cstr() prototype to it, perhaps even
 move the implementations from their current ruleutils.c location to
 quote.c.  (Why half the stuff is in ruleutils.c rather than quote.c is
 beyond me.)

Yes, an extra header file would be cleaner. Now if we are worried about
creating too many incompatibilities with existing extensions (IMO that's not
something core should worry much about), then it is better not to do it.
Now, if I can vote on it, I think it is worth doing in order to have
builtins.h contain only Datum foo(PG_FUNCTION_ARGS), and move all the
quote-related things into quote.c.
-- 
Michael


Re: [HACKERS] Add CREATE support to event triggers

2014-09-25 Thread Andres Freund
On 2014-09-25 18:59:31 -0300, Alvaro Herrera wrote:
  /* tid.c */
 diff --git a/src/include/utils/ruleutils.h b/src/include/utils/ruleutils.h
 new file mode 100644
 index 000..520b066
 --- /dev/null
 +++ b/src/include/utils/ruleutils.h
 @@ -0,0 +1,34 @@
 +/*-
 + *
 + * ruleutils.h
 + *   Declarations for ruleutils.c
 + *
 + * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group
 + * Portions Copyright (c) 1994, Regents of the University of California
 + *
 + * src/include/ruleutils.h
 + *
 + *-
 + */
 +#ifndef RULEUTILS_H
 +#define RULEUTILS_H
 +
 +#include "nodes/nodes.h"
 +#include "nodes/parsenodes.h"
 +#include "nodes/pg_list.h"
 +
 +
 +extern char *pg_get_indexdef_string(Oid indexrelid);
 +extern char *pg_get_indexdef_columns(Oid indexrelid, bool pretty);
 +
 +extern char *pg_get_constraintdef_string(Oid constraintId);
 +extern char *deparse_expression(Node *expr, List *dpcontext,
 +bool forceprefix, bool showimplicit);
 +extern List *deparse_context_for(const char *aliasname, Oid relid);
 +extern List *deparse_context_for_planstate(Node *planstate, List *ancestors,
 +                                           List *rtable, List *rtable_names);
 +extern List *select_rtable_names_for_explain(List *rtable,
 +                                             Bitmapset *rels_used);
 +extern char *generate_collation_name(Oid collid);
 +
 +#endif   /* RULEUTILS_H */
 -- 
 1.9.1
 

I wondered for a minute whether any of these are likely to cause
problems for code just including builtins.h - but I think there will be
sufficiently few callers for them to make that not much of a concern.

 From 5e3555058a1bbb93e25a3a104c26e6b96dce994d Mon Sep 17 00:00:00 2001
 From: Alvaro Herrera alvhe...@alvh.no-ip.org
 Date: Thu, 25 Sep 2014 16:34:50 -0300
 Subject: [PATCH 02/27] deparse/core: have RENAME return attribute number

Looks harmless. I.e. +1.

 From 72c4d0ef875f9e9b0f3b424499738fd1bd946c32 Mon Sep 17 00:00:00 2001
 From: Alvaro Herrera alvhe...@alvh.no-ip.org
 Date: Fri, 9 May 2014 18:32:23 -0400
 Subject: [PATCH 03/27] deparse/core: event triggers support GRANT/REVOKE

Hm. The docs don't mention whether these only work for database-local
objects or also shared objects. Generally event triggers are only
triggered for the former... But the code doesn't seem to make a
difference?

 From 5bb35d2e51fe6c6e219fe5cf7ddbab5423e6be1b Mon Sep 17 00:00:00 2001
 From: Alvaro Herrera alvhe...@alvh.no-ip.org
 Date: Thu, 25 Sep 2014 15:44:44 -0300
 Subject: [PATCH 04/27] deparse/core: event triggers support COMMENT
 
 ---
  doc/src/sgml/event-trigger.sgml  |  8 +++-
  src/backend/commands/comment.c   |  5 -
  src/backend/commands/event_trigger.c |  1 +
  src/backend/tcop/utility.c   | 21 +
  src/include/commands/comment.h   |  2 +-
  5 files changed, 30 insertions(+), 7 deletions(-)
 
 diff --git a/doc/src/sgml/event-trigger.sgml b/doc/src/sgml/event-trigger.sgml
 index 49b8384..39ecd94 100644
 --- a/doc/src/sgml/event-trigger.sgml
 +++ b/doc/src/sgml/event-trigger.sgml
 @@ -37,7 +37,7 @@
 para
   The literalddl_command_start/ event occurs just before the
   execution of a literalCREATE/, literalALTER/, literalDROP/,
 - literalGRANT/ or literalREVOKE/
 + literalCOMMENT/, literalGRANT/ or literalREVOKE/
   command.  No check whether the affected object exists or doesn't exist 
 is
   performed before the event trigger fires.
   As an exception, however, this event does not occur for
 @@ -281,6 +281,12 @@
  entry align=centerliteral-/literal/entry
 /row
 row
 +entry align=leftliteralCOMMENT/literal/entry
 +entry align=centerliteralX/literal/entry
 +entry align=centerliteralX/literal/entry
 +entry align=centerliteral-/literal/entry
 +   /row
 +   row

Hm. No drop support because COMMENTs are odd and use odd NULL
semantics. Fair enough.

 From 098f5acabd774004dc5d9c750d55e7c9afa60238 Mon Sep 17 00:00:00 2001
 From: Alvaro Herrera alvhe...@alvh.no-ip.org
 Date: Wed, 24 Sep 2014 15:53:04 -0300
 Subject: [PATCH 05/27] deparse: infrastructure needed for command deparsing


 ---
  src/backend/catalog/objectaddress.c  | 115 +
  src/backend/commands/event_trigger.c | 941 
 ++-
  src/backend/tcop/Makefile|   2 +-
  src/backend/tcop/deparse_utility.c   | 877 
  src/backend/tcop/utility.c   |   2 +
  src/backend/utils/adt/format_type.c  | 113 -
  src/include/catalog/objectaddress.h  |   2 +
  src/include/catalog/pg_proc.h|   4 +
  src/include/commands/event_trigger.h |   3 +
  src/include/commands/extension.h |   2 +-
  src/include/nodes/parsenodes.h   |   2 +
  

Re: [HACKERS] Add CREATE support to event triggers

2014-09-25 Thread Michael Paquier
On Fri, Sep 26, 2014 at 6:59 AM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 Actually here's a different split of these patches, which I hope makes
 more sense.  My intention here is that patches 0001 to 0004 are simple
 changes that can be pushed right away; they are not directly related to
 the return-creation-command feature.  Patches 0005 to 0027 implement
 that feature incrementally.  You can see in patch 0005 the DDL commands
 that are still not implemented in deparse (they are the ones that have
 an elog(ERROR) rather than a command = NULL).  Patch 0006 adds calls
 in ProcessUtilitySlow() to each command, so that the object(s) being
 touched are added to the event trigger command stash.

 Patches from 0007 to 0027 (excepting patch 0017) implement one or a
 small number of commands in deparse.  Patch 0017 is necessary
 infrastructure in ALTER TABLE to support deparsing that one.

 My intention with the later patches is that they would all be pushed as
 a single commit, i.e. the deparse support would be implemented for all
 commands in one fell swoop rather than piecemeal -- except possibly patch
 0017 (the ALTER TABLE infrastructure).  I split them up only for ease of
 review.  Of course, before pushing we (I) need to implement deparsing
 for all the remaining commands.

Thanks for the update. This stuff is still on my TODO list and I was
planning to look at it to some extent today btw :) Andres has already
sent some comments (20140925235724.gh16...@awork2.anarazel.de) though,
I'll try to not overlap with what has already been mentioned.

Patch 1: I still like this patch as it gives a clear separation of the
built-in functions and the sub-functions of ruleutils.c that are
completely independent. Have you considered adding the external
declaration of quote_all_identifiers as well? It is true that this
impacts extensions (some of my stuff as well), but my point is to bite
the bullet and make the separation cleaner between builtins.h and
ruleutils.h. Attached is a patch that can be applied on top of patch 1
doing so... Feel free to discard for the potential breakage this would
create though.

Patch 2:
1) Declarations of renameatt are inconsistent:
@tablecmds.c:
-renameatt(RenameStmt *stmt)
+renameatt(RenameStmt *stmt, int32 *objsubid)
@tablecmds.h:
-extern Oid renameatt(RenameStmt *stmt);
+extern Oid renameatt(RenameStmt *stmt, int *attnum);
Looking at this code, for consistency renameatt should use int *attnum
everywhere and ExecRenameStmt should use int32 *subobjid.
2) Just a smallish picky formatting remark, I would have done that on
a single line:
+   attnum =
+   renameatt_internal(relid,
3) In ExecRenameStmt, I think that you should set subobjid for
aggregates, collations, FDWs, etc. An ObjectAddress is accessible there,
so that's easy to set using address.objectSubId.
4) This comment is misleading, as for an attribute what is returned is
actually an attribute number:
+ * Return value is the OID of the renamed object.  The objectSubId, if any,
+ * is returned in objsubid.
  */
5) Perhaps it is worth mentioning at the top of renameatt that the
attribute number is returned as well?

Patch 3: Looks fine, a bit of testing is showing up that this works as expected:
=# CREATE ROLE foo;
CREATE ROLE
=# CREATE TABLE aa (a int);
CREATE TABLE
=# CREATE OR REPLACE FUNCTION abort_grant()
RETURNS event_trigger AS $$
BEGIN
RAISE NOTICE 'Event: %', tg_event;
RAISE EXCEPTION 'Execution of command % forbidden', tg_tag;
END;
$$ LANGUAGE plpgsql;
CREATE FUNCTION
=# CREATE EVENT TRIGGER abort_grant_trigger ON ddl_command_start
WHEN TAG IN ('GRANT', 'REVOKE') EXECUTE PROCEDURE abort_grant();
CREATE EVENT TRIGGER
=# GRANT SELECT ON aa TO foo;
NOTICE:  0: Event: ddl_command_start
LOCATION:  exec_stmt_raise, pl_exec.c:3068
ERROR:  P0001: Execution of command GRANT forbidden
LOCATION:  exec_stmt_raise, pl_exec.c:3068
=# DROP EVENT TRIGGER abort_grant_trigger;
DROP EVENT TRIGGER
=# CREATE EVENT TRIGGER abort_grant_trigger ON ddl_command_end
WHEN TAG IN ('GRANT', 'REVOKE') EXECUTE PROCEDURE abort_grant();
CREATE EVENT TRIGGER
=# REVOKE SELECT ON aa FROM foo;
NOTICE:  0: Event: ddl_command_end
LOCATION:  exec_stmt_raise, pl_exec.c:3068
ERROR:  P0001: Execution of command REVOKE forbidden
LOCATION:  exec_stmt_raise, pl_exec.c:3068

Patch 4: Shouldn't int32 be used instead of uint32 in the declaration
of CommentObject? And yes, adding support for only ddl_command_start
and ddl_command_end is enough. Let's not play with dropped objects in
this area... Similarly to the test above:
=# comment on table aa is 'test';
NOTICE:  0: Event: ddl_command_start
LOCATION:  exec_stmt_raise, pl_exec.c:3068
ERROR:  P0001: Execution of command COMMENT forbidden
LOCATION:  exec_stmt_raise, pl_exec.c:3068
Regards,
-- 
Michael
diff --git a/contrib/dblink/dblink.c b/contrib/dblink/dblink.c
index d77b3ee..fc8dc80 100644
--- a/contrib/dblink/dblink.c
+++ b/contrib/dblink/dblink.c
@@ -57,6 +57,7 @@
 #include 

Re: [HACKERS] Add CREATE support to event triggers

2014-09-19 Thread Michael Paquier
On Tue, Aug 26, 2014 at 11:14 PM, Michael Paquier
michael.paqu...@gmail.com wrote:
 And a last one before lunch, closing the review for all the basic things...
 Patch 13: Could you explain why this is necessary?
 +extern PGDLLIMPORT bool creating_extension;
 It may make sense by looking at the core features (then why isn't it
 with the core features?), but I am trying to review the patches in
 order.
Those patches have been reviewed up to number 14. Some of them could
be applied as-is as they are useful taken independently, but most of
them need a rebase, hence marking it as returned with feedback.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add CREATE support to event triggers

2014-08-26 Thread Michael Paquier
On Tue, Aug 26, 2014 at 8:10 AM, Michael Paquier
michael.paqu...@gmail.com wrote:
 Well, I like the patch series for what it counts as long as you can
 submit it as such. That's far easier to test and certainly helps in
 spotting issues when kicking different code paths.

So, for patch 2, which is a cosmetic change for
pg_event_trigger_dropped_objects:
=# select pg_event_trigger_dropped_objects();
ERROR:  0A000: pg_event_trigger_dropped_objects can only be called in
a sql_drop event trigger function
LOCATION:  pg_event_trigger_dropped_objects, event_trigger.c:1220
This drops () from the error message with the function name. I guess
that this is fine. But PG_FUNCNAME_MACRO is used nowhere except
elog.h, and can as well be NULL. So if that's really necessary,
shouldn't we use FunctionCallInfo instead? It is also not that
bad to hardcode the function name in the error message IMO.

For patch 5:
+1 for this move. When working on Postgres-XC a couple of years back I
wondered why this distinction was not made. Wouldn't it make sense to
move the declaration of quote_all_identifiers to ruleutils.h as well?
That's a GUC and moving it out of builtins.h would make sense IMO.

Patch 8 needs a rebase (patch independent on 1~6 it seems):
1 out of 57 hunks FAILED -- saving rejects to file
src/backend/commands/tablecmds.c.rej
(Stripping trailing CRs from patch.)

Patch 9:
1) It needs a rebase, AlterTableSpaceMoveStmt has been renamed to
AlterTableMoveAllStmt by commit 3c4cf08
2) Table summarizing event trigger firing needs to be updated with the
new command supported (src/sgml/event-trigger.sgml)

Patch 10, similar problems as patch 9:
1) Needs a rebase
2) table summarizing supported commands should be updated.
You could directly group patches 9 and 10 in the final commit IMO.
GRANT/REVOKE would also be the first command that would be supported
by event triggers that is not of the type CREATE/DROP/ALTER, hence
once it is rebased I would like to do some testing with it (same with
patch 9 btw) and see how it reacts with the event sql_drop
particularly (it should error out but still).

Patch 11: applies with some hunks.
So... This patch introduces an operation that reverse-engineers
format_type_internal... I think that this would be more
adapted with a couple of new routines in lsyscache.[c|h] instead:
- One for the type name
- One for typmodout
- One for is_array
- One for its namespace
TBH, I wanted those routines a couple of times when working on XC and
finished coding them at the end, but in XC only :)

Patch 12: patch applies correctly.
Form_pg_sequence is already exposed in sequence.h even if it is only
used in sequence.c, so yes it seems to be the correct way to do it
here assuming that we need this data to rebuild a DDL. Why is
ACL_UPDATE needed when checking permissions? This new routine only
reads the values and does not update them. And a confirmation: ACL_USAGE
is used to make this routine usable for PL languages in this case,
right?
I think that you should mention at the top of get_sequence_values that
it returns a palloc'd result, and that it is the responsibility of
the caller to free it.

That's all I have for now.
Regards,
-- 
Michael




Re: [HACKERS] Add CREATE support to event triggers

2014-08-26 Thread Michael Paquier
On Wed, Aug 27, 2014 at 1:10 PM, Michael Paquier
michael.paqu...@gmail.com wrote:
 On Tue, Aug 26, 2014 at 8:10 AM, Michael Paquier
 michael.paqu...@gmail.com wrote:
 Well, I like the patch series for what it counts as long as you can
 submit it as such. That's far easier to test and certainly helps in
 spotting issues when kicking different code paths.

 So, for patch 2, which is a cosmetic change for
 pg_event_trigger_dropped_objects:
 =# select pg_event_trigger_dropped_objects();
 ERROR:  0A000: pg_event_trigger_dropped_objects can only be called in
 a sql_drop event trigger function
 LOCATION:  pg_event_trigger_dropped_objects, event_trigger.c:1220
 This drops () from the error message with the function name. I guess
 that this is fine. But PG_FUNCNAME_MACRO is used nowhere except
 elog.h, and can as well be NULL. So if that's really necessary,
 shouldn't we use FunctionCallInfo instead? It is also not that
 bad to hardcode the function name in the error message IMO.

 For patch 5:
 +1 for this move. When working on Postgres-XC a couple of years back I
 wondered why this distinction was not made. Wouldn't it make sense to
 move as well the declaration of quote_all_identifiers to ruleutils.h.
 That's a GUC and moving it out of builtins.h would make sense IMO.

 Patch 8 needs a rebase (patch independent on 1~6 it seems):
 1 out of 57 hunks FAILED -- saving rejects to file
 src/backend/commands/tablecmds.c.rej
 (Stripping trailing CRs from patch.)

 Patch 9:
 1) It needs a rebase, AlterTableSpaceMoveStmt has been renamed to
 AlterTableMoveAllStmt by commit 3c4cf08
 2) Table summarizing event trigger firing needs to be updated with the
 new command supported (src/sgml/event-trigger.sgml)

 Patch 10, similar problems as patch 9:
 1) Needs a rebase
 2) table summarizing supported commands should be updated.
 You could directly group patches 9 and 10 in the final commit IMO.
 GRANT/REVOKE would also be the first command that would be supported
 by event triggers that is not of the type CREATE/DROP/ALTER, hence
 once it is rebased I would like to do some testing with it (same with
 patch 9 btw) and see how it reacts with the event sql_drop
 particularly (it should error out but still).

 Patch 11: applies with some hunks.
 So... This patch introduces an operation that reverse-engineers
 format_type_internal... I think that this would be more
 adapted with a couple of new routines in lsyscache.[c|h] instead:
 - One for the type name
 - One for typmodout
 - One for is_array
 - One for its namespace
 TBH, I wanted those routines a couple of times when working on XC and
 finished coding them at the end, but in XC only :)

 Patch 12: patch applies correctly.
 Form_pg_sequence is already exposed in sequence.h even if it is only
 used in sequence.c, so yes it seems to be the correct way to do it
 here assuming that we need this data to rebuild a DDL. Why is
 ACL_UPDATE needed when checking permissions? This new routine only
 reads the values and does not update it. And a confirmation: ACL_USAGE
 is used to make this routine usable for PL languages in this case,
 right?
 I think that you should mention at the top of get_sequence_values that
 it returns a palloc'd result, and that it is the responsibility of
 caller to free it.

And a last one before lunch, closing the review for all the basic things...
Patch 13: Could you explain why this is necessary?
+extern PGDLLIMPORT bool creating_extension;
It may make sense by looking at the core features (then why isn't it
with the core features?), but I am trying to review the patches in
order.
-- 
Michael




Re: [HACKERS] Add CREATE support to event triggers

2014-08-25 Thread Michael Paquier
On Sat, Jun 14, 2014 at 5:31 AM, Alvaro Herrera alvhe...@2ndquadrant.com
wrote:

 Here's a refreshed version of this patch.  I have split it up in a
 largish number of pieces, which hopefully makes it easier to understand
 what is going on.

Alvaro,

Could you confirm that the patches you just committed are 1, 3 and 6?

Regards,
-- 
Michael


Re: [HACKERS] Add CREATE support to event triggers

2014-08-25 Thread Alvaro Herrera
Michael Paquier wrote:
 On Sat, Jun 14, 2014 at 5:31 AM, Alvaro Herrera alvhe...@2ndquadrant.com
 wrote:
 
  Here's a refreshed version of this patch.  I have split it up in a
  largish number of pieces, which hopefully makes it easier to understand
  what is going on.
 
 Alvaro,
 
 Could you confirm that the patches you just committed are 1, 3 and 6?

And 4.  Yes, they are.  I wanted to get trivial stuff out of the way
while I had some other trivial patch at hand.  I'm dealing with another
patch from the commitfest now, so I'm not posting a rebased version
right away, apologies.

How do people like this patch series?  It would be much easier for me to
submit a single patch, but I feel handing it in little pieces makes it
easier for reviewers.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Add CREATE support to event triggers

2014-08-25 Thread Michael Paquier
On Tue, Aug 26, 2014 at 5:33 AM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 Michael Paquier wrote:
 And 4.  Yes, they are.  I wanted to get trivial stuff out of the way
 while I had some other trivial patch at hand.  I'm dealing with another
 patch from the commitfest now, so I'm not posting a rebased version
 right away, apologies.
No problems. I imagine that most of the patches still apply.

 How do people like this patch series?  It would be much easier for me to
 submit a single patch, but I feel handing it in little pieces makes it
 easier for reviewers.
Well, I like the patch series for what it counts as long as you can
submit it as such. That's far easier to test and certainly helps in
spotting issues when kicking different code paths.
-- 
Michael




Re: [HACKERS] Add CREATE support to event triggers

2014-06-14 Thread Jim Nasby

On 2/6/14, 11:20 AM, Alvaro Herrera wrote:

NOTICE:  JSON blob: {
 definition: [
 {
 clause: owned,
 fmt: OWNED BY %{owner}D,
 owner: {
 attrname: a,
 objname: t1,
 schemaname: public
 }
 }
 ],
 fmt: ALTER SEQUENCE %{identity}D %{definition: }s,
 identity: {
 objname: t1_a_seq,
 schemaname: public
 }
}
NOTICE:  expanded: ALTER SEQUENCE public.t1_a_seq OWNED BY public.t1.a


Apologies if this has been discussed and I missed it, but shouldn't part of the 
JSON be a field that indicates what command is being run? It doesn't seem wise 
to conflate detecting what the command is with the overall format string.
--
Jim C. Nasby, Data Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net




Re: [HACKERS] Add CREATE support to event triggers

2014-06-14 Thread Alvaro Herrera
Jim Nasby wrote:
 On 2/6/14, 11:20 AM, Alvaro Herrera wrote:
 NOTICE:  JSON blob: {
  definition: [
  {
  clause: owned,
  fmt: OWNED BY %{owner}D,
  owner: {
  attrname: a,
  objname: t1,
  schemaname: public
  }
  }
  ],
  fmt: ALTER SEQUENCE %{identity}D %{definition: }s,
  identity: {
  objname: t1_a_seq,
  schemaname: public
  }
 }
 NOTICE:  expanded: ALTER SEQUENCE public.t1_a_seq OWNED BY public.t1.a
 
 Apologies if this has been discussed and I missed it, but shouldn't part of 
 the JSON be a field that indicates what command is being run? It doesn't seem 
 wise to conflate detecting what the command is with the overall format string.

That's reported as a separate field by the
pg_event_trigger_creation_commands function.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Add CREATE support to event triggers

2014-03-14 Thread Robert Haas
On Thu, Mar 13, 2014 at 5:06 PM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 Alvaro Herrera escribió:

 I also fixed the sequence OWNED BY problem simply by adding support for
 ALTER SEQUENCE.  Of course, the intention is that all forms of CREATE
 and ALTER are supported, but this one seems reasonable standalone
 because CREATE TABLE uses it internally.

 I have been hacking on this on and off.  This afternoon I discovered
 that interval typmod output can also be pretty unusual.  Example:

 create table a (a interval year to month);

 For the column, we get this type spec (note the typmod):

 coltype: {
 is_array: false,
 schemaname: pg_catalog,
 typename: interval,
 typmod:  year to month
 },

 so the whole command output ends up being this:

 NOTICE:  expanded: CREATE  TABLE  public.a (a pg_catalog.interval year to 
 month   )WITH (oids=OFF)

 However, this is not accepted on input:

 alvherre=# CREATE  TABLE  public.a (a pg_catalog.interval year to month   ) 
WITH (oids=OFF);
 ERROR:  syntax error at or near year
 LÍNEA 1: CREATE  TABLE  public.a (a pg_catalog.interval year to mon...
   ^

 I'm not too sure what to do about this yet.  I checked the catalogs and
 gram.y, and it seems that interval is the only type that allows such
 strange games to be played.  I would hate to be forced to add a kludge
 specific to type interval, but that seems to be the only option.  (This
 would involve checking the OID of the type in deparse_utility.c, and if
 it's INTERVALOID, then omit the schema qualification and quoting on the
 type name).

 I have also been working on adding ALTER TABLE support.  So far it's
 pretty simple; here is an example.  Note I run a single command which
 includes a SERIAL column, and on output I get three commands (just like
 a serial column on create table).

 alvherre=# alter table tt add column b numeric, add column c serial, alter 
 column a set default extract(epoch from now());
 NOTICE:  JSON blob: {
 definition: [
 {
 clause: cache,
 fmt: CACHE %{value}s,
 value: 1
 },
 {
 clause: cycle,
 fmt: %{no}s CYCLE,
 no: NO
 },
 {
 clause: increment_by,
 fmt: INCREMENT BY %{value}s,
 value: 1
 },
 {
 clause: minvalue,
 fmt: MINVALUE %{value}s,
 value: 1
 },
 {
 clause: maxvalue,
 fmt: MAXVALUE %{value}s,
 value: 9223372036854775807
 },
 {
 clause: start,
 fmt: START WITH %{value}s,
 value: 1
 },
 {
 clause: restart,
 fmt: RESTART %{value}s,
 value: 1
 }
 ],
 fmt: CREATE %{persistence}s SEQUENCE %{identity}D %{definition: }s,
 identity: {
 objname: tt_c_seq,
 schemaname: public
 },
 persistence: 
 }

What does the colon-space in %{definition: }s mean?

In general, it seems like you're making good progress here, and I'm
definitely happier with this than with previous approaches, but I'm
still concerned about how maintainable it's going to be.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Add CREATE support to event triggers

2014-03-14 Thread Alvaro Herrera

Robert Haas wrote:

 What does the colon-space in %{definition: }s mean?

It means it expects the definition element to be a JSON array, and
that it will format the elements by separating them with a space.  In
other DDL commands, there are things like %{table_element:, }s  which
means to separate with comma-space.  (In CREATE TABLE, these
table_elements might be column definitions or constraints).  It's a
pretty handy way to format various things.  The separator is arbitrary.
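The mechanics described above can be sketched in a few lines. This is a toy
re-implementation of the "expand" step, not the patch's C code: the deparse
tree is modelled as plain dicts/strings (the real %{name}D and %{name}I
conversions carry quoting rules for qualified and plain identifiers, which
this sketch ignores), and the ":sep" variant joins array elements with the
given separator, mirroring %{definition: }s and %{table_elements:, }s.

```python
import re

def expand(tree):
    """Render a deparse tree into text using the tree's own 'fmt' string."""
    def render(value, sep=' '):
        if isinstance(value, dict):
            return expand(value)          # nested object: recurse via its fmt
        if isinstance(value, list):
            return sep.join(render(v) for v in value)
        return str(value)

    def repl(match):
        name, sep = match.group(1), match.group(2)
        return render(tree[name], sep if sep is not None else ' ')

    # Matches %{name}s and %{name:SEP}s; the conversion letter (s, I, D, T
    # in the real patch) is consumed but its quoting rules are not modelled.
    return re.sub(r'%\{(\w+)(?::([^}]*))?\}\w', repl, tree['fmt'])
```

For example, a simplified version of the ALTER SEQUENCE blob shown earlier
in the thread expands to `ALTER SEQUENCE public.t1_a_seq OWNED BY
public.t1.a`.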

 In general, it seems like you're making good progress here, and I'm
 definitely happier with this than with previous approaches, but I'm
 still concerned about how maintainable it's going to be.

Thanks.

There are three parts to this code.  Two of them are infrastructure to
make it all work: one is the code to support creation of the JSON
values, that is, functions to add new elements to the tree that's
eventually going to become JSON (this is the approximately 640 lines at
the top of deparse_utility.c).  The second one is the expand
functionality, i.e. what turns the JSON back into text (bottom 700 lines
in event_trigger.c).  Both are fairly static now; while I was writing it
initially there was a lot of churn until I found an interface that made
the most sense.  I don't think these parts are going to cause much
trouble.

The third part is the bits that take a parse node and determine what
elements to put into JSON.  This is the part that is going to show the
most churn as DDL is modified.  But it's not a lot of code; for
instance, deparsing a CREATE SCHEMA node takes 30 lines.
90 lines to deparse CREATE RULE.
80 lines for CREATE INDEX.
280 lines of shared code for ALTER SEQUENCE and CREATE SEQUENCE.

I don't think we should be worried about there being a lot of extra code
to write as DDL is added or modified.  I do share your concern that
we're going to *forget* to write these things in the first place, unless
we do something to avoid that problem specifically.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Add CREATE support to event triggers

2014-03-14 Thread Robert Haas
On Fri, Mar 14, 2014 at 12:00 PM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 I don't think we should be worried about there being a lot of extra code
 to write as DDL is added or modified.  I do share your concern that
 we're going to *forget* to write these things in the first place, unless
 we do something to avoid that problem specifically.

That mirrors my concern.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Add CREATE support to event triggers

2014-03-13 Thread Alvaro Herrera
Alvaro Herrera escribió:

 I also fixed the sequence OWNED BY problem simply by adding support for
 ALTER SEQUENCE.  Of course, the intention is that all forms of CREATE
 and ALTER are supported, but this one seems reasonable standalone
 because CREATE TABLE uses it internally.

I have been hacking on this on and off.  This afternoon I discovered
that interval typmod output can also be pretty unusual.  Example:

create table a (a interval year to month);

For the column, we get this type spec (note the typmod):

coltype: {
is_array: false, 
schemaname: pg_catalog, 
typename: interval, 
typmod:  year to month
}, 

so the whole command output ends up being this:

NOTICE:  expanded: CREATE  TABLE  public.a (a pg_catalog.interval year to 
month   )WITH (oids=OFF)

However, this is not accepted on input:

alvherre=# CREATE  TABLE  public.a (a pg_catalog.interval year to month   )   
 WITH (oids=OFF);
ERROR:  syntax error at or near year
LÍNEA 1: CREATE  TABLE  public.a (a pg_catalog.interval year to mon...
  ^

I'm not too sure what to do about this yet.  I checked the catalogs and
gram.y, and it seems that interval is the only type that allows such
strange games to be played.  I would hate to be forced to add a kludge
specific to type interval, but that seems to be the only option.  (This
would involve checking the OID of the type in deparse_utility.c, and if
it's INTERVALOID, then omit the schema qualification and quoting on the
type name).
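The kludge described above amounts to something like the following sketch
(a hypothetical helper for illustration, not the patch's code; 1186 is the
pg_type OID of interval):

```python
# Interval typmod output (e.g. " year to month") is made of grammar
# keywords that are only accepted after the bare, unquoted word "interval",
# so for that one type the deparser must omit schema qualification and
# quoting; every other type can be schema-qualified normally.
INTERVALOID = 1186  # OID of the interval type in pg_type

def format_column_type(type_oid, schemaname, typename, typmod_out):
    if type_oid == INTERVALOID:
        return typename + typmod_out                      # interval year to month
    return schemaname + '.' + typename + typmod_out       # pg_catalog.int4, etc.
```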

I have also been working on adding ALTER TABLE support.  So far it's
pretty simple; here is an example.  Note I run a single command which
includes a SERIAL column, and on output I get three commands (just like
a serial column on create table).

alvherre=# alter table tt add column b numeric, add column c serial, alter 
column a set default extract(epoch from now());
NOTICE:  JSON blob: {
definition: [
{
clause: cache, 
fmt: CACHE %{value}s, 
value: 1
}, 
{
clause: cycle, 
fmt: %{no}s CYCLE, 
no: NO
}, 
{
clause: increment_by, 
fmt: INCREMENT BY %{value}s, 
value: 1
}, 
{
clause: minvalue, 
fmt: MINVALUE %{value}s, 
value: 1
}, 
{
clause: maxvalue, 
fmt: MAXVALUE %{value}s, 
value: 9223372036854775807
}, 
{
clause: start, 
fmt: START WITH %{value}s, 
value: 1
}, 
{
clause: restart, 
fmt: RESTART %{value}s, 
value: 1
}
], 
fmt: CREATE %{persistence}s SEQUENCE %{identity}D %{definition: }s, 
identity: {
objname: tt_c_seq, 
schemaname: public
}, 
persistence: 
}
NOTICE:  expanded: CREATE  SEQUENCE public.tt_c_seq CACHE 1 NO CYCLE INCREMENT 
BY 1 MINVALUE 1 MAXVALUE 9223372036854775807 START WITH 1 RESTART 1
NOTICE:  JSON blob: {
fmt: ALTER TABLE %{identity}D %{subcmds:, }s, 
identity: {
objname: tt, 
schemaname: public
}, 
subcmds: [
{
definition: {
collation: {
fmt: COLLATE %{name}D, 
present: false
}, 
coltype: {
is_array: false, 
schemaname: pg_catalog, 
typename: numeric, 
typmod: 
}, 
default: {
fmt: DEFAULT %{default}s, 
present: false
}, 
fmt: %{name}I %{coltype}T %{default}s %{not_null}s 
%{collation}s, 
name: b, 
not_null: , 
type: column
}, 
fmt: ADD COLUMN %{definition}s, 
type: add column
}, 
{
definition: {
collation: {
fmt: COLLATE %{name}D, 
present: false
}, 
coltype: {
is_array: false, 
schemaname: pg_catalog, 
typename: int4, 
typmod: 
}, 
default: {
default: 
pg_catalog.nextval('public.tt_c_seq'::pg_catalog.regclass), 
fmt: DEFAULT %{default}s
}, 
fmt: %{name}I %{coltype}T %{default}s %{not_null}s 
%{collation}s, 
name: c, 
not_null: , 
type: column
}, 
fmt: ADD COLUMN %{definition}s, 
type: add column
}, 
{
column: a, 
definition: pg_catalog.date_part('epoch'::pg_catalog.text, 
pg_catalog.now()), 

Re: [HACKERS] Add CREATE support to event triggers

2014-02-06 Thread Robert Haas
On Thu, Feb 6, 2014 at 12:08 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Wed, Feb 5, 2014 at 2:26 PM, Alvaro Herrera alvhe...@2ndquadrant.com 
 wrote:
 Then again, why is the behavior of schema-qualifying absolutely
 everything even desirable?

 Well, someone could create a collation in another schema with the same
 name as a system collation and the command would become ambiguous.

 Hmm, good point.  I guess we don't worry much about this with pg_dump
 because we assume that we're restoring into an empty database (and if
 not, the user gets to keep both pieces).  You're applying a higher
 standard here.

 Robert, that's just horsepucky.  pg_dump is very careful about schemas.
 It's also careful to not schema-qualify names unnecessarily, which is an
 intentional tradeoff to improve readability of the dump --- at the cost
 that the dump might break if restored into a nonempty database with
 conflicting objects.  In the case of data passed to event triggers,
 there's a different tradeoff to be made: people will probably value
 consistency over readability, so always-qualify is probably the right
 choice here.  But in neither case are we being sloppy.

I didn't mean to imply otherwise.  I know that pg_dump tries to avoid
excess schema-qualification for readability, among other reasons; what
I meant was that Alvaro is applying a higher standard specifically in
regard to replayability.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add CREATE support to event triggers

2014-02-05 Thread Robert Haas
On Tue, Feb 4, 2014 at 12:11 AM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 I have run into some issues, though:

 1. certain types, particularly timestamp/timestamptz but really this
 could happen for any type, have unusual typmod output behavior.  For
 those one cannot just use the schema-qualified catalog names and then
 append the typmod at the end; because what you end up is something like
pg_catalog.timestamptz(4) with time zone
 because, for whatever reason, the with time zone is part of typmod
 output.  But this doesn't work at all for input.  I'm not sure how to
 solve this.

How about doing whatever pg_dump does?

 2. I have been having object definitions be emitted complete; in
 particular, sequences have OWNED BY clauses when they have an owning
 column.  But this doesn't work with a SERIAL column, because we get
 output like this:

 alvherre=# CREATE  TABLE public.hijo  (b serial);
 NOTICE:  expanded: CREATE  SEQUENCE public.hijo_b_seq INCREMENT BY 1 MINVALUE 
 1 MAXVALUE 9223372036854775807 START WITH 1 CACHE 1 NO CYCLE OWNED BY 
 public.hijo.b
 NOTICE:  expanded: CREATE  TABLE public.hijo  (b pg_catalog.int4 DEFAULT 
 nextval('hijo_b_seq'::regclass) NOT NULL )

 which is all nice, except that the sequence is using the column name as
 owner before the column has been created in the first place.  Both these
 command will, of course, fail, because both depend on the other to have
 been executed first.  The tie is easy to break in this case: just don't
 emit the OWNED BY clause .. but it makes me a bit nervous to be
 hardcoding the decision of parts that might depend on others.  OTOH
 pg_dump already knows how to split objects in constituent parts as
 necessary; maybe it's not so bad.

Well, the sequence can't depend on a table column that doesn't exist
yet, so if it's in effect doing what you've shown there, it's
cheating by virtue of knowing that nobody can observe the
intermediate state.  Strictly speaking, there's nothing wrong with
emitting those commands just as you have them there; they won't run,
but if what you want to do is log what's happened rather than replay
it, that's OK.  Producing output that is actually executable is a
strictly harder problem than producing output that accurately
describes what happened.  As you say, pg_dump already splits things
and getting executable output out of this facility will require the
same kinds of tricks here.  This gets back to my worry about
maintaining two or three copies of the code that solve many of the
same problems in quite different ways...

 3. It is possible to coerce ruleutils.c to emit always-qualified names
 by using PushOverrideSearchPath() facility; but actually this doesn't
 always work, because some places in namespace.c believe that
 PG_CATALOG_NAMESPACE is always visible and so certain objects are not
 qualified.  In particular, text columns using default collation will be
 emitted as having collation "default" and not "pg_catalog.default" as I
 would have initially expected.  Right now it doesn't seem like this is a
 problem, but it is unusual.

We have a quote_all_identifiers flag.  We could have a
schema_qualify_all_identifiers flag, too.  Then again, why is the
behavior of schema-qualifying absolutely everything even desirable?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Add CREATE support to event triggers

2014-02-05 Thread Alvaro Herrera
Robert Haas escribió:
 On Tue, Feb 4, 2014 at 12:11 AM, Alvaro Herrera
 alvhe...@2ndquadrant.com wrote:
  I have run into some issues, though:
 
  1. certain types, particularly timestamp/timestamptz but really this
  could happen for any type, have unusual typmod output behavior.  For
  those one cannot just use the schema-qualified catalog names and then
  append the typmod at the end; because what you end up is something like
 pg_catalog.timestamptz(4) with time zone
  because, for whatever reason, the with time zone is part of typmod
  output.  But this doesn't work at all for input.  I'm not sure how to
  solve this.
 
 How about doing whatever pg_dump does?

We use format_type() for that as far as I know.  What it does
differently is use undecorated names defined by the standard for some
types, which are never schema qualified and are never ambiguous because
they are reserved words that would require quoting if used by
user-defined type names.  We can't use that here: somewhere upthread we
noticed issues when using those which is why we're now trying to use
catalog names instead of those special names.  (I don't think it's
impossible to use such names: we just need to ensure we handle quoting
correctly for the funny cases such as "char" and bit.)

One idea is to chop the typmod output string at the closing parens.
That seems to work well for the types that come with core code ... and
the rule with the funny string at the end after the parenthesized part
of the typmod applies only to type names with hardcoded productions in
gram.y, so there is no danger that outside code will have a problem with
that strategy.

(I verified PostGIS types with typmods just to see an example of
out-of-core code at work, and unsurprisingly it only emits stuff inside
parens.)
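The chopping idea can be sketched outside the backend like this (a hypothetical helper, not code from the patch; the real implementation would operate on the string returned by the type's typmod-output function):

```python
def chop_typmod(typmod_out: str) -> str:
    """Keep only the parenthesized part of a typmod output string.

    Trailing decoration such as " with time zone" comes only from type
    names with hardcoded productions in gram.y, so discarding everything
    after the closing paren should be safe when reconstructing input
    syntax of the form pg_catalog.typename(typmod).
    """
    close = typmod_out.find(")")
    return typmod_out[:close + 1] if close != -1 else typmod_out

# timestamptz(4) prints its typmod as "(4) with time zone"
print(chop_typmod("(4) with time zone"))  # -> (4)
print(chop_typmod("(10)"))                # -> (10)
```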

  2. I have been having object definitions be emitted complete; in
  particular, sequences have OWNED BY clauses when they have an owning
  column.  But this doesn't work with a SERIAL column, because we get
  output like this:

 Well, the sequence can't depend on a table column that doesn't exist
 yet, so if it's in effect doing what you've shown there, it's
 cheating by virtue of knowing that nobody can observe the
 intermediate state.  Strictly speaking, there's nothing wrong with
 emitting those commands just as you have them there; they won't run,
 but if what you want to do is log what's happened rather than replay
 it, that's OK.  Producing output that is actually executable is a
 strictly harder problem than producing output that accurately
 describes what happened.  As you say, pg_dump already splits things
 and getting executable output out of this facility will require the
 same kinds of tricks here.

Yeah.  Moreover this will require that this new event trigger code is
able to handle (at least certain kinds of) ALTER, which expands this
patch in scope by a wide margin.

Producing executable commands is an important goal.

 This gets back to my worry about maintaining two or three copies of
 the code that solve many of the same problems in quite different
 ways...

Agreed.  It would be real good to be able to use this code for pg_dump
too, but it seems dubious.  The two environments seem just too different
to be able to reuse anything.  But if you see a way, by all means shout.

  3. It is possible to coerce ruleutils.c to emit always-qualified names
  by using PushOverrideSearchPath() facility; but actually this doesn't
  always work, because some places in namespace.c believe that
  PG_CATALOG_NAMESPACE is always visible and so certain objects are not
  qualified.  In particular, text columns using default collation will be
  emitted as having collation default and not pg_catalog.default as I
  would have initially expected.  Right now it doesn't seem like this is a
  problem, but it is unusual.
 
 We have a quote_all_identifiers flag.  We could have a
 schema_qualify_all_identifiers flag, too.

Actually the bit about collations was just a garden variety bug: turns
out I was using a %{}I specifier (and correspondingly only the collation
name as a string) instead of %{}D and get_catalog_object_by_oid() to
match.  I'm not yet sure if this new flag you suggest will really be
needed, but thanks for the idea.

 Then again, why is the behavior of schema-qualifying absolutely
 everything even desirable?

Well, someone could create a collation in another schema with the same
name as a system collation and the command would become ambiguous.  For
example, this command
  create table f (a text collate POSIX);
creates a column whose collation is either public.POSIX or the system
POSIX collation, depending on whether public appears before pg_catalog
in search_path.  Having someone create a POSIX collation in a schema
that appears earlier than pg_catalog is pretty far-fetched, but ISTM
that we really need to cover that scenario.
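The ambiguity boils down to the usual search_path resolution rule: the first schema in the path that contains the name wins. A toy illustration (invented data, not how namespace.c is actually written):

```python
def resolve(name, search_path, catalog):
    """Return the schema-qualified name an unqualified reference binds to,
    scanning the schemas of search_path in order."""
    for schema in search_path:
        if name in catalog.get(schema, set()):
            return "%s.%s" % (schema, name)
    return None

# Collations visible in a hypothetical database where a user created
# a POSIX collation in schema public.
catalog = {"pg_catalog": {"POSIX", "C", "default"}, "public": {"POSIX"}}
print(resolve("POSIX", ["public", "pg_catalog"], catalog))   # public.POSIX
print(resolve("POSIX", ["pg_catalog", "public"], catalog))   # pg_catalog.POSIX
```

Always emitting the qualified form sidesteps the dependence on path order entirely.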

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Add CREATE support to event triggers

2014-02-05 Thread Tom Lane
Alvaro Herrera alvhe...@2ndquadrant.com writes:
 Robert Haas escribió:
 How about doing whatever pg_dump does?

 We use format_type() for that as far as I know.  What it does
 differently is use undecorated names defined by the standard for some
 types, which are never schema qualified and are never ambiguous because
 they are reserved words that would require quoting if used by
 user-defined type names.  We can't use that here: somewhere upthread we
 noticed issues when using those which is why we're now trying to use
 catalog names instead of those special names.  (I don't think it's
 impossible to use such names: we just need to ensure we handle quoting
 correctly for the funny cases such as "char" and bit.)

Yeah, but wouldn't that complexity also bubble into user code within the
event triggers?  Since there's no real need for SQL standard compliance
in this context, I think minimizing the number of weird formats is a
win.

 One idea is to chop the typmod output string at the closing parens.

+1.  The only reason timestamptypmodout works like that is that we're
trying to match the SQL standard's spelling of the type names, and
that committee apparently considers it an off day whenever they can't
invent some randomly-incompatible-with-everything syntax.

regards, tom lane




Re: [HACKERS] Add CREATE support to event triggers

2014-02-05 Thread Robert Haas
On Wed, Feb 5, 2014 at 2:26 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote:
 Then again, why is the behavior of schema-qualifying absolutely
 everything even desirable?

 Well, someone could create a collation in another schema with the same
 name as a system collation and the command would become ambiguous.  For
 example, this command
   create table f (a text collate POSIX);
 creates a column whose collation is either public.POSIX or the system
 POSIX collation, depending on whether public appears before pg_catalog
 in search_path.  Having someone create a POSIX collation in a schema
 that appears earlier than pg_catalog is pretty far-fetched, but ISTM
 that we really need to cover that scenario.

Hmm, good point.  I guess we don't worry much about this with pg_dump
because we assume that we're restoring into an empty database (and if
not, the user gets to keep both pieces).  You're applying a higher
standard here.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Add CREATE support to event triggers

2014-02-05 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Wed, Feb 5, 2014 at 2:26 PM, Alvaro Herrera alvhe...@2ndquadrant.com 
 wrote:
 Then again, why is the behavior of schema-qualifying absolutely
 everything even desirable?

 Well, someone could create a collation in another schema with the same
 name as a system collation and the command would become ambiguous.

 Hmm, good point.  I guess we don't worry much about this with pg_dump
 because we assume that we're restoring into an empty database (and if
 not, the user gets to keep both pieces).  You're applying a higher
 standard here.

Robert, that's just horsepucky.  pg_dump is very careful about schemas.
It's also careful to not schema-qualify names unnecessarily, which is an
intentional tradeoff to improve readability of the dump --- at the cost
that the dump might break if restored into a nonempty database with
conflicting objects.  In the case of data passed to event triggers,
there's a different tradeoff to be made: people will probably value
consistency over readability, so always-qualify is probably the right
choice here.  But in neither case are we being sloppy.

regards, tom lane




Re: [HACKERS] Add CREATE support to event triggers

2014-01-30 Thread Dimitri Fontaine
Hi,

Alvaro Herrera alvhe...@2ndquadrant.com writes:
 So here's a patch implementing the ideas expressed in this thread.
 There are two new SQL functions:

I spent some time reviewing the patch and tried to focus on a higher
level review, as I saw Andres already began with the lower level stuff.

The main things to keep in mind here are:

  - this patch enables running Event Triggers anytime a new object is
created, in a way that the user code is run once the object already
made it through the catalogs;

  - the Event Trigger code has access to the full details about every
created object, so it's not tied to a command but really to the fact
that an object was just created in the catalogs;

   (it's important with serial and primary key sub-commands)

  - facilities are provided so that it's possible to easily build an SQL
command that if executed would create the exact same object again;

  - the facilities around passing the created object details and
building a SQL command are made in such a way that it's trivially
possible to hack away the captured objects properties before
producing again a new SQL command.

After careful study and thinking, it appears to me that the proposed
patch addresses the whole range of features we expect here, and is both
flexible enough for the users and easy enough to maintain.

The event being fired once the objects are available in the catalogs
makes it possible for the code providing the details in the JSON format
to complete the parsetree with necessary information.

Current state of the patch is not ready for commit yet; independent of
code details, some more high-level work needs to be done:

  - preliminary commit

It might be a good idea to separate out some prerequisites of this
patch and commit them separately: the OID publishing parts, and
allowing an Event Trigger to get fired after CREATE but without the
extra detailed JSON-formatted information, might already be good
commits, with the much-needed details about what actually happened
added later.

  - document the JSON format

I vote for going with the proposed format, because it actually
allows implementing both the audit and replication features we want,
with the capability of hacking schema, data types, SQL
representation etc; and because I couldn't think of anything better
than what's proposed here ;-)

  - other usual documentation

I don't suppose I have to expand on what I mean here…

  - fill-in other commands

Not all commands are supported in the submitted patch. I think once
we have clear documentation on the general JSON formatting and how
to use it as a user, we need to include support for all CREATE
commands that we have.

I see nothing against extending when this work has to be done until
after the CF, as long as it's fully done before beta. After all it's
only about filling in minutiae at this point.

  - review the JSON producing code

It might be possible to use more of the internal supports for JSON
now that the format is freezing.

  - regression tests

The patch will need some. The simpler solution is to add a new
regression test entry and exercise all the CREATE commands in there,
in a specific schema, activating an event trigger that outputs the
JSON detailed information each time (the snitch() example).

Best would be to have some pretty indented output of the JSON to
help with reviewing diffs, and I have to wonder about JSON object
inner-ordering if we're going to do that.

No other ideas on this topic from me.

 The JSON parsing is done in event_trigger.c.  This code should probably
 live elsewhere, but I again hesitate to put it in json.c or jsonfuncs.c,
 at least until some discussion about its general applicability takes
 place.

I see that as useful enough if it can be made to work without the
special "fmt" fields somehow, with a nice default formatting ability.

In particular, being able to build some intermediate object with
json_agg and then call the formatting/expanding function on top of that
might be quite useful.

That said, I don't think we have enough time to attack this problem now,
I think it would be wiser to address your immediate problem separately
in your patch and clean it later (next release) with sharing code and
infrastructure and offering a more generally useful tool. At least we
will have some feedback about the Event Trigger specific context then.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




Re: [HACKERS] Add CREATE support to event triggers

2014-01-23 Thread Andres Freund
On 2014-01-15 02:11:11 -0300, Alvaro Herrera wrote:
 Then execute commands to your liking.

So, I am giving a quick look, given that I very likely will have to
write a consumer for it...

* deparse_utility_command returns payload via parameter, not return
  value?
* functions beginning with an underscore formally are reserved, we
  shouldn't add new places using such a convention.
* I don't think dequote_jsonval is correct as is, IIRC that won't
  correctly handle unicode escapes and such. I think json.c needs to
  expose functionality for this.
* I wonder if expand_jsonval_identifier shouldn't truncate as well? It
  should get handled by the individual created commands, right?
* So, if I read things correctly, identifiers in json are never
  downcased, is that correct? I.e. they are implicitly quoted?
* Why must we not schema qualify system types
  (c.f. expand_jsonval_typename)? It seems to be perfectly sensible to
  me to just use pg_catalog there.
* It looks like pg_event_trigger_expand_command will misparse nested {,
  error out instead?
* I wonder if deparseColumnConstraint couldn't somehow share a bit more
  code with ruleutils.c's pg_get_constraintdef_worker().
* I guess you know, but deparseColumnConstraint() doesn't handle foreign
  keys yet.
* Is tcop/ the correct location for deparse_utility.c? I wonder if it
  shouldn't be alongside ruleutils.c instead.
* shouldn't pg_event_trigger_get_creation_commands return command as
  json instead of text?

* Not your department, but a builtin json pretty printer would be
  really, really handy. Something like
CREATE FUNCTION json_prettify(j json)
RETURNS TEXT AS $$
import json
return json.dumps(json.loads(j), sort_keys=True, indent=4)
$$ LANGUAGE PLPYTHONU;
 makes the json so much more readable.

Some minimal tests:
* CREATE SEQUENCE errors out with:
NOTICE:  JSON blob:
{"sequence":{"relation":"frakbar2","schema":"public"},"persistence":"","fmt":"CREATE %{persistence}s SEQUENCE %{identity}D"}
ERROR:  non-existant element "identity" in JSON formatting object
* CREATE TABLE frakkbar2(id int); errors out with:
postgres=# CREATE TABLE frakkbar2(id int);
NOTICE:  JSON blob:
{"on_commit":{"present":false,"on_commit_value":null,"fmt":"ON COMMIT %{on_commit_value}s"},"tablespace":{"present":false,"tablespace":null,"fmt":"TABLESPACE %{tablespace}I"},"inherits":{"present":false,"parents":null,"fmt":"INHERITS (%{parents:, }D)"},"table_elements":[{"collation":{"present":false,"fmt":"COLLATE %{name}I"},"type":{"typmod":"","typename":"integer","is_system":true,"is_array":false},"name":"id","fmt":"%{name}I %{type}T %{collation}s"}],"of_type":{"present":false,"of_type":null,"fmt":"OF %{of_type}T"},"if_not_exists":"","identity":{"relation":"frakkbar2","schema":"public"},"persistence":"","fmt":"CREATE %{persistence}s TABLE %{identity}D %{if_not_exists}s %{of_type}s (%{table_elements:, }s) %{inherits}s %{on_commit}s %{tablespace}s"}
ERROR:  invalid NULL is_system flag in %T element
CONTEXT:  PL/pgSQL function snitch() line 8 at RAISE

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Add CREATE support to event triggers

2014-01-23 Thread Alvaro Herrera
Andres Freund escribió:

 * Why must we not schema qualify system types
   (c.f. expand_jsonval_typename)? It seems to be perfectly sensible to
   me to just use pg_catalog there.

So, the reason for doing things this way is to handle cases like
varchar(10) being turned into "character varying"; and that name
requires that the typename NOT be schema-qualified, otherwise it fails.
But thinking about this again, I don't see a reason why this can't be
returned simply as pg_catalog.varchar(10); this should work fine on the
receiving end as well, and give the same result.

The other cases I'm worried about are types like bit(1) vs. unadorned
bit vs. double-quoted "bit", and "char", etc.  I'm not sure I'm dealing
with them correctly right now.  So even if by the above paragraph I
could make the is_system thingy go away, I might still need it to cover
this case.

Thanks for the review, I will post an updated version later after fixing
the other issues you mentioned plus adding support for more commands.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Add CREATE support to event triggers

2014-01-23 Thread Tom Lane
Alvaro Herrera alvhe...@2ndquadrant.com writes:
 So, the reason for doing things this way is to handle cases like
 varchar(10) being turned into character varying; and that name
 requires that the typename NOT be schema-qualified, otherwise it fails.
 But thinking about this again, I don't see a reason why this can't be
 returned simply as pg_catalog.varchar(10); this should work fine on the
 receiving end as well, and give the same result.

I think people would be unhappy if we changed the output of, say, pg_dump
that way.  But it's presumably not a problem for strings inside event
triggers.  Once upon a time, the typmods would have been an issue, but now
that we support them in generic typename syntax I think we're good.

 The other cases I'm worried about are types like bit(1) vs. unadorned
 bit vs. double-quoted "bit", and "char", etc.  I'm not sure I'm dealing
 with them correctly right now.  So even if by the above paragraph I
 could make the is_system thingy go away, I might still need it to cover
 this case.

Yeah, there are some weird cases there.  I think you can make them go away
if you always print the fully qualified type name, ie don't omit
pg_catalog, and use pg_type.typname not any converted version (and don't
forget to double-quote anything that might be a reserved word).
But I've not looked closely at the code.
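The quoting rule Tom describes can be sketched roughly like this (the reserved-word list here is a tiny invented stand-in, not the real keyword table):

```python
import re

RESERVED = {"char", "bit", "table", "select"}  # stand-in list, not exhaustive

def quote_ident(name: str) -> str:
    """Double-quote unless the name is a simple, non-reserved identifier;
    embedded double quotes are doubled, per SQL identifier rules."""
    if re.fullmatch(r"[a-z_][a-z0-9_]*", name) and name not in RESERVED:
        return name
    return '"' + name.replace('"', '""') + '"'

print(quote_ident("int4"))      # int4
print(quote_ident("char"))      # "char"
print(quote_ident('odd"name'))  # "odd""name"
```

With that plus unconditional pg_catalog qualification, pg_catalog."char" and pg_catalog."bit" stop being special cases.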

regards, tom lane




Re: [HACKERS] Add CREATE support to event triggers

2014-01-14 Thread Pavel Stehule
Hello


2014/1/13 Alvaro Herrera alvhe...@2ndquadrant.com

 Alvaro Herrera escribió:

  In an event trigger, the function
 pg_event_trigger_get_creation_commands()
  returns the following JSON blob:

 After playing with this for a while, I realized something that must have
 seemed quite obvious to those paying attention: what this function is,
 is just a glorified sprintf() for JSON.  So I propose we take our
 existing format(text) and use it to model a new format(json) function,
 which will be useful to the project at hand and be of more general
 applicability.

 To make it a better fit, I have changed the spec slightly.  The format
 string is now the fmt element in the topmost JSON.  This format string
 can contain % escapes, which consist of:

 * the literal % itself
 * an element name, enclosed in braces { }.  The name can optionally be
   followed by a colon and a possibly-empty array separator.
 * a format specifier, which can be I (identifier), D (dotted name), or s
   (string)
 * Alternatively, %% expands to a literal %, as usual.

 For each such escape, the JSON object is searched using the element name
 as key.  For identifiers, the element is expected to be a string, and
 will be quoted per identifier quoting rules.  Dotted-names are used to
 format possibly-qualified relation names and such; the element must be
 an object with one, two or three string elements, each of which is
 quoted per identifier rules, and output separated by periods.

 Finally, for arrays we expand each element in the JSON array element,
 and separate them with the separator specified in the {} part of the
 format specifier.

 For instance,
 alvherre=# select format(json '{"fmt":"hello, %{who}s! This is %{name}I",
 "who":"world", "name":"a function"}');
               format
 ------------------------------------
  hello, world! This is "a function"

 Elements can be objects, in which case they are expanded recursively: a
 fmt element is looked up and expanded as described above.


 I don't yet see a need for %L escapes (that is, literals that can expand
 to a single-quoted value or to NULL), but if I see it I will add that
 too.


I am not against this idea, although I don't see a strong benefit.
A dedicated special function might be better - it has minimal relation to
the variadic function format() - and a nested mixed format can be messy.



 --
 Álvaro Herrera                http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services





Re: [HACKERS] Add CREATE support to event triggers

2014-01-14 Thread Robert Haas
On Fri, Jan 10, 2014 at 6:22 PM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 Here's one idea: create a contrib module that (somehow, via APIs to be
 invented) runs every DDL command that gets executed through the
 deparsing code, and then parses the result and executes *that* instead
 of the original command.  Then, add a build target that runs the
 regression test suite in that mode, and get the buildfarm configured
 to run that build target regularly on at least some machines.  That
 way, adding syntax to the regular regression test suite also serves to
 test that the deparsing logic for that syntax is working.  If we do
 this, there's still some maintenance burden associated with having DDL
 deparsing code, but at least our chances of noticing when we've failed
 to maintain it should be pretty good.

 I gave this some more thought and hit a snag.  The problem here is that
 by the time the event trigger runs, the original object has already been
 created.  At that point, we can't simply replace the created objects
 with objects that would hypothetically be created by a command trigger.

Hmm, so these triggers are firing after the corresponding actions have
been performed.  Yeah, that's tricky.  I don't immediately have an
idea how to come up with a comprehensive test framework for that,
although I do think that the way you're structuring the JSON blobs
makes this a whole lot less error-prone than the approaches that have
been explored previously.  The big question in my mind is still "if
somebody updates the grammar in gram.y, how do we make sure that they
notice that this stuff needs to be updated to match?".  Ideally the
answer is "if they don't, the regression tests fail."

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Add CREATE support to event triggers

2014-01-14 Thread Jim Nasby

On 1/14/14, 2:05 PM, Robert Haas wrote:

On Fri, Jan 10, 2014 at 6:22 PM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:

Here's one idea: create a contrib module that (somehow, via APIs to be
invented) runs every DDL command that gets executed through the
deparsing code, and then parses the result and executes *that* instead
of the original command.  Then, add a build target that runs the
regression test suite in that mode, and get the buildfarm configured
to run that build target regularly on at least some machines.  That
way, adding syntax to the regular regression test suite also serves to
test that the deparsing logic for that syntax is working.  If we do
this, there's still some maintenance burden associated with having DDL
deparsing code, but at least our chances of noticing when we've failed
to maintain it should be pretty good.


I gave this some more thought and hit a snag.  The problem here is that
by the time the event trigger runs, the original object has already been
created.  At that point, we can't simply replace the created objects
with objects that would hypothetically be created by a command trigger.


Hmm, so these triggers are firing after the corresponding actions have
been performed.  Yeah, that's tricky.  I don't immediately have an
idea how to come up with a comprehensive test framework for that,
although I do think that the way you're structuring the JSON blobs
makes this a whole lot less error-prone than the approaches that have
been explored previously.  The big question in my mind is still "if
somebody updates the grammar in gram.y, how do we make sure that they
notice that this stuff needs to be updated to match?".  Ideally the
answer is "if they don't, the regression tests fail."


Can't we capture the JSON, use that to create a new database, and then test 
against that? Unfortunately it'd be easiest to do that at a high level and not 
for individual commands, but it's a lot better than nothing...
--
Jim C. Nasby, Data Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net




Re: [HACKERS] Add CREATE support to event triggers

2014-01-13 Thread Alvaro Herrera
Alvaro Herrera escribió:

 In an event trigger, the function pg_event_trigger_get_creation_commands()
 returns the following JSON blob:

After playing with this for a while, I realized something that must have
seemed quite obvious to those paying attention: what this function is,
is just a glorified sprintf() for JSON.  So I propose we take our
existing format(text) and use it to model a new format(json) function,
which will be useful to the project at hand and be of more general
applicability.

To make it a better fit, I have changed the spec slightly.  The format
string is now the "fmt" element in the topmost JSON.  This format string
can contain % escapes, which consist of:

* the literal % itself
* an element name, enclosed in braces { }.  The name can optionally be
  followed by a colon and a possibly-empty array separator.
* a format specifier, which can be I (identifier), D (dotted name), or s
  (string)
* Alternatively, %% expands to a literal %, as usual.

For each such escape, the JSON object is searched using the element name
as key.  For identifiers, the element is expected to be a string, and
will be quoted per identifier quoting rules.  Dotted-names are used to
format possibly-qualified relation names and such; the element must be
an object with one, two or three string elements, each of which is
quoted per identifier rules, and output separated by periods.

Finally, for arrays we expand each element in the JSON array element,
and separate them with the separator specified in the {} part of the
format specifier.

For instance,
alvherre=# select format(json '{"fmt": "hello, %{who}s! This is %{name}I",
"who": "world", "name": "a function"}');
               format
------------------------------------
 hello, world! This is "a function"

Elements can be objects, in which case they are expanded recursively: a
"fmt" element is looked up and expanded as described above.


I don't yet see a need for %L escapes (that is, literals that can expand
to a single-quoted value or to NULL), but if I see it I will add that
too.
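To make the rules above concrete, here is a rough, self-contained Python model
of the proposed format(json) expansion -- a sketch only, with simplified
identifier-quoting rules, not the actual C implementation:

```python
import json
import re

def quote_ident(name):
    # Simplified PostgreSQL identifier quoting: quote anything that is not
    # a plain lower-case identifier, doubling any embedded quotes.
    if re.fullmatch(r'[a-z_][a-z0-9_]*', name):
        return name
    return '"' + name.replace('"', '""') + '"'

def expand(obj):
    """Expand the "fmt" string of a JSON object per the proposed spec."""
    def expand_elem(elem, spec):
        if spec == 'D':                      # dotted name: 1-3 string elements
            return '.'.join(quote_ident(p) for p in elem.values())
        if isinstance(elem, dict):           # nested object: recurse on its fmt
            return expand(elem)
        if spec == 'I':                      # identifier, quoted if needed
            return quote_ident(elem)
        return str(elem)                     # 's': plain string

    def repl(m):
        if m.group(0) == '%%':               # %% expands to a literal %
            return '%'
        name, sep, spec = m.group('name'), m.group('sep'), m.group('spec')
        elem = obj[name]
        if sep is not None:                  # array: expand elements, join with sep
            return sep.join(expand_elem(e, spec) for e in elem)
        return expand_elem(elem, spec)

    escape = re.compile(r'%%|%\{(?P<name>\w+)(?::(?P<sep>[^}]*))?\}(?P<spec>[IDs])')
    return escape.sub(repl, obj['fmt'])

doc = json.loads('{"fmt": "hello, %{who}s! This is %{name}I", '
                 '"who": "world", "name": "a function"}')
print(expand(doc))    # hello, world! This is "a function"
```

The array-separator form would then let a column list render as
`expand({"fmt": "(%{cols:, }I)", "cols": ["a", "b c"]})`, giving `(a, "b c")`.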

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] Add CREATE support to event triggers

2014-01-11 Thread Simon Riggs
On 10 January 2014 23:22, Alvaro Herrera alvhe...@2ndquadrant.com wrote:

 On the subject of testing the triggers, Robert Haas wrote:

 Here's one idea: create a contrib module that (somehow, via APIs to be
 invented) runs every DDL command that gets executed through the
 deparsing code, and then parses the result and executes *that* instead
 of the original command.  Then, add a build target that runs the
 regression test suite in that mode, and get the buildfarm configured
 to run that build target regularly on at least some machines.  That
 way, adding syntax to the regular regression test suite also serves to
 test that the deparsing logic for that syntax is working.  If we do
 this, there's still some maintenance burden associated with having DDL
 deparsing code, but at least our chances of noticing when we've failed
 to maintain it should be pretty good.

 I gave this some more thought and hit a snag.  The problem here is that
 by the time the event trigger runs, the original object has already been
 created.  At that point, we can't simply replace the created objects
 with objects that would hypothetically be created by a command trigger.

Yes, all we are doing is firing a trigger that has access to the
information about the current object.

 A couple of very hand-wavy ideas:

 1. in the event trigger, DROP the original object and CREATE it as
 reported by the creation_commands SRF.

 2. Have ddl_command_start open a savepoint, and then roll it back in
 ddl_command_end, then create the object again.  Not sure this is doable
 because of the whole SPI nesting issue .. maybe with C-language event
 trigger functions?

We need to be clear what action we are testing exactly. Once we know
that we can come up with a test mechanism.

We can come up with various tests that test coverage but that may not
be enough. Mostly we're just splitting up fields from the DDL, like
object name into its own text field, as well as augmenting it with
database name. That's pretty simple and shouldn't require much
testing. This code seems likely to be prone to missing features much
more so than actual bugs. Missing a piece of useful information in the
JSON doesn't mean there is a bug.

If all the event trigger does is generate JSON, then all we need to do
is test that the trigger fired and generated well-formed JSON with the
right information content. That should be very simple and I would say
the simpler the better.
To do that we can run code to parse the JSON and produce formatted
text output like this...
 Field   |   Value
---------+-----------
 XX      | afagaha
 YY      | aamnamna
The output can then be verified by visual inspection, proving that we
generated the required JSON.
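A trivial sketch of that inspection step (assuming the trigger hands the blob
back as text; the helper name is invented):

```python
import json

def json_to_table(blob):
    """Render a deparse JSON blob as a Field | Value listing suitable for
    visual inspection in regression test output."""
    doc = json.loads(blob)
    width = max(len(k) for k in doc)
    lines = ['%-*s | Value' % (width, 'Field'),
             '%s-+------' % ('-' * width)]
    for key in sorted(doc):
        lines.append('%-*s | %s' % (width, key, json.dumps(doc[key])))
    return '\n'.join(lines)

print(json_to_table('{"name": "some schema", "if_not_exists": "IF NOT EXISTS"}'))
```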

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] Add CREATE support to event triggers

2014-01-10 Thread Simon Riggs
On 8 January 2014 20:42, Alvaro Herrera alvhe...@2ndquadrant.com wrote:

 CREATE SCHEMA IF NOT EXISTS "some schema" AUTHORIZATION "some guy";

Hmm, given in 9.3 it was OK to have only DROP event triggers, I think
it should be equally acceptable to have just CREATE, but without every
option on CREATE.  CREATE SCHEMA is easily the most complex thing here
and would be the command/event to deprioritise if we had any issues
getting this done/agreeing something for 9.4.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] Add CREATE support to event triggers

2014-01-10 Thread Robert Haas
On Thu, Jan 9, 2014 at 5:17 PM, Jim Nasby j...@nasby.net wrote:
 On 1/9/14, 11:58 AM, Alvaro Herrera wrote:
 Robert Haas escribió:

 On Wed, Jan 8, 2014 at 10:27 PM, Alvaro Herrera
 alvhe...@2ndquadrant.com wrote:


 Hmm.  This seems like a reasonable thing to do, except that I would like
 the output to always be the constant, and have some other way to
 enable the clause or disable it.  With your present boolean:
 so

  "if_not_exists": {"output": "IF NOT EXISTS",
                    "present": true/false}


 Why not:

 "if_not_exists": true/false


 Yeah, that's another option.  If we do this, though, the expansion
 function would have to know that an "if_not_exists" element expands to "IF
 NOT EXISTS".  Maybe that's okay.  Right now, the expansion function is
 pretty stupid, which is nice.

 Yeah, the source side of this will always have to understand the nuances of
 every command; it'd be really nice to not burden the other side with that as
 well. The only downside I see is a larger JSON output, but meh.

 Another advantage is if you really wanted to you could modify the output
 formatting in the JSON doc to do something radically different if so
 inclined...

Yeah.  I wasn't necessarily objecting to the way Alvaro did it, just
asking why he did it that way.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Add CREATE support to event triggers

2014-01-10 Thread Robert Haas
On Fri, Jan 10, 2014 at 10:36 AM, Simon Riggs si...@2ndquadrant.com wrote:
 On 8 January 2014 20:42, Alvaro Herrera alvhe...@2ndquadrant.com wrote:

 CREATE SCHEMA IF NOT EXISTS "some schema" AUTHORIZATION "some guy";

 Hmm, given in 9.3 it was OK to have only DROP event triggers, I think
 it should be equally acceptable to have just CREATE, but without every
 option on CREATE.  CREATE SCHEMA is easily the most complex thing here
 and would be the command/event to deprioritise if we had any issues
 getting this done/agreeing something for 9.4.

I don't know that I agree with that, but I guess we can cross that
bridge when we come to it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Add CREATE support to event triggers

2014-01-10 Thread Simon Riggs
On 10 January 2014 15:48, Robert Haas robertmh...@gmail.com wrote:
 On Fri, Jan 10, 2014 at 10:36 AM, Simon Riggs si...@2ndquadrant.com wrote:
 On 8 January 2014 20:42, Alvaro Herrera alvhe...@2ndquadrant.com wrote:

 CREATE SCHEMA IF NOT EXISTS "some schema" AUTHORIZATION "some guy";

 Hmm, given in 9.3 it was OK to have only DROP event triggers, I think
 it should be equally acceptable to have just CREATE, but without every
 option on CREATE.  CREATE SCHEMA is easily the most complex thing here
 and would be the command/event to deprioritise if we had any issues
 getting this done/agreeing something for 9.4.

 I don't know that I agree with that, but I guess we can cross that
 bridge when we come to it.

We've come to it...

You would prefer either everything or nothing?? On what grounds?

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] Add CREATE support to event triggers

2014-01-10 Thread Robert Haas
On Fri, Jan 10, 2014 at 10:55 AM, Simon Riggs si...@2ndquadrant.com wrote:
 On 10 January 2014 15:48, Robert Haas robertmh...@gmail.com wrote:
 On Fri, Jan 10, 2014 at 10:36 AM, Simon Riggs si...@2ndquadrant.com wrote:
 On 8 January 2014 20:42, Alvaro Herrera alvhe...@2ndquadrant.com wrote:

 CREATE SCHEMA IF NOT EXISTS "some schema" AUTHORIZATION "some guy";

 Hmm, given in 9.3 it was OK to have only DROP event triggers, I think
 it should be equally acceptable to have just CREATE, but without every
 option on CREATE.  CREATE SCHEMA is easily the most complex thing here
 and would be the command/event to deprioritise if we had any issues
 getting this done/agreeing something for 9.4.

 I don't know that I agree with that, but I guess we can cross that
 bridge when we come to it.

 We've come to it...

 You would prefer either everything or nothing?? On what grounds?

I hardly think I need to justify that position.  That's project policy
and always has been.  When somebody implements 50% of a feature, or
worse yet 95% of a feature, it violates the POLA for users and doesn't
always subsequently get completed, leaving us with long-term warts
that are hard to eliminate.  It's perfectly fine to implement a
feature incrementally if the pieces are individually self-consistent
and ideally even useful, but deciding to support every command except
one because the last one is hard to implement doesn't seem like a
principled approach to anything.  It's not even obvious to me that
CREATE SCHEMA is all that much harder than anything else and Alvaro
has not said that that's the only thing he can't implement (or why) so
I think it's entirely premature to make the decision now about which
way to proceed - but, OK, sure, if you want to force the issue now,
then yeah, I think it's better to have everything or nothing than to
have support for only some things justified by nothing more than
implementation complexity.

Aside from the general issue, in this particular case, I have
previously and repeatedly expressed concerns about regression test
coverage and suggested a path that would guarantee thorough regression
testing but which would require that support be complete for
everything present in our regression tests.  Although there may be
some other plan for guaranteeing thorough regression testing not only
now but going forward, I have not seen it proposed here.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Add CREATE support to event triggers

2014-01-10 Thread Simon Riggs
On 10 January 2014 17:07, Robert Haas robertmh...@gmail.com wrote:
 On Fri, Jan 10, 2014 at 10:55 AM, Simon Riggs si...@2ndquadrant.com wrote:
 On 10 January 2014 15:48, Robert Haas robertmh...@gmail.com wrote:
 On Fri, Jan 10, 2014 at 10:36 AM, Simon Riggs si...@2ndquadrant.com wrote:
 On 8 January 2014 20:42, Alvaro Herrera alvhe...@2ndquadrant.com wrote:

 CREATE SCHEMA IF NOT EXISTS "some schema" AUTHORIZATION "some guy";

 Hmm, given in 9.3 it was OK to have only DROP event triggers, I think
 it should be equally acceptable to have just CREATE, but without every
 option on CREATE.  CREATE SCHEMA is easily the most complex thing here
 and would be the command/event to deprioritise if we had any issues
 getting this done/agreeing something for 9.4.

 I don't know that I agree with that, but I guess we can cross that
 bridge when we come to it.

 We've come to it...

 You would prefer either everything or nothing?? On what grounds?

 I hardly think I need to justify that position.

Yeh, you do. Everybody does.

 That's project policy
 and always has been.  When somebody implements 50% of a feature, or
 worse yet 95% of a feature, it violates the POLA for users and doesn't
 always subsequently get completed, leaving us with long-term warts
 that are hard to eliminate.

So why was project policy violated when we released 9.3 with only DROP
event support? Surely that was a worse violation of POLA than my
suggestion?

It's not reasonable to do something yourself and then object when
others suggest doing the same thing.

After 3 years we need something useful. I think the "perfect being the
enemy of the good" argument applies here after this length of time.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] Add CREATE support to event triggers

2014-01-10 Thread Robert Haas
On Fri, Jan 10, 2014 at 12:59 PM, Simon Riggs si...@2ndquadrant.com wrote:
 That's project policy
 and always has been.  When somebody implements 50% of a feature, or
 worse yet 95% of a feature, it violates the POLA for users and doesn't
 always subsequently get completed, leaving us with long-term warts
 that are hard to eliminate.

 So why was project policy violated when we released 9.3 with only DROP
 event support? Surely that was a worse violation of POLA than my
 suggestion?

Well, obviously I didn't think so at the time, or I would have
objected.  I felt, and still feel, that implementing one kind of event
trigger (drop) does not necessarily require implementing another kind
(create).  I think that's clearly different from implementing either
one for only some object types.

"This event trigger will be called whenever an object is dropped" is a
reasonable contract with the user.  "This other event trigger will be
called whenever an object is created, unless it happens to be a
schema" is much less reasonable.

At least in my opinion.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Add CREATE support to event triggers

2014-01-10 Thread Simon Riggs
On 10 January 2014 18:17, Robert Haas robertmh...@gmail.com wrote:
 On Fri, Jan 10, 2014 at 12:59 PM, Simon Riggs si...@2ndquadrant.com wrote:
 That's project policy
 and always has been.  When somebody implements 50% of a feature, or
 worse yet 95% of a feature, it violates the POLA for users and doesn't
 always subsequently get completed, leaving us with long-term warts
 that are hard to eliminate.

 So why was project policy violated when we released 9.3 with only DROP
 event support? Surely that was a worse violation of POLA than my
 suggestion?

 Well, obviously I didn't think so at the time, or I would have
 objected.  I felt, and still feel, that implementing one kind of event
 trigger (drop) does not necessarily require implementing another kind
 (create).  I think that's clearly different from implementing either
 one for only some object types.

 "This event trigger will be called whenever an object is dropped" is a
 reasonable contract with the user.  "This other event trigger will be
 called whenever an object is created, unless it happens to be a
 schema" is much less reasonable.

 At least in my opinion.

In the fullness of time, I agree that is not a restriction we should maintain.

Given that CREATE SCHEMA with multiple objects is less well used, it's
a reasonable restriction to accept for one release, if the alternative
is to implement nothing at all of value. Especially since we are now
in the third year of development of this set of features, it is time
to reduce the scope to ensure delivery.

There may be other ways to ensure something of value is added, this
was just one suggestion.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] Add CREATE support to event triggers

2014-01-10 Thread Alvaro Herrera

Can we please stop arguing over a problem I don't have?  I started with
CREATE SCHEMA because it is one of the easy cases, not because it was
the most difficult case: we only need to deparse the bits of it that
don't involve the objects within, because those are reported by the
event trigger as separate commands.  Each object gets its own creation
command, which works pretty nicely.  Of course, the deparsed version of
the command will not look very much like what was submitted by the user,
but that doesn't matter -- what does matter is that the objects created
by running the commands reported in the event trigger will be (or should
be) the same as those created by the original command.

I was showing CREATE SCHEMA as a way to discuss the fine details: how to
report identifiers that might need quoting, what to do with optional
clauses (AUTHORIZATION), etc.  I am past that now.

On the subject of testing the triggers, Robert Haas wrote:

 Here's one idea: create a contrib module that (somehow, via APIs to be
 invented) runs every DDL command that gets executed through the
 deparsing code, and then parses the result and executes *that* instead
 of the original command.  Then, add a build target that runs the
 regression test suite in that mode, and get the buildfarm configured
 to run that build target regularly on at least some machines.  That
 way, adding syntax to the regular regression test suite also serves to
 test that the deparsing logic for that syntax is working.  If we do
 this, there's still some maintenance burden associated with having DDL
 deparsing code, but at least our chances of noticing when we've failed
 to maintain it should be pretty good.

I gave this some more thought and hit a snag.  The problem here is that
by the time the event trigger runs, the original object has already been
created.  At that point, we can't simply replace the created objects
with objects that would hypothetically be created by a command trigger.

A couple of very hand-wavy ideas:

1. in the event trigger, DROP the original object and CREATE it as
reported by the creation_commands SRF.

2. Have ddl_command_start open a savepoint, and then roll it back in
ddl_command_end, then create the object again.  Not sure this is doable
because of the whole SPI nesting issue .. maybe with C-language event
trigger functions?

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] Add CREATE support to event triggers

2014-01-10 Thread Jim Nasby

On 1/10/14, 3:40 PM, Simon Riggs wrote:

Given that CREATE SCHEMA with multiple objects is less well used, it's
a reasonable restriction to accept for one release, if the alternative
is to implement nothing at all of value. Especially since we are now
in the third year of development of this set of features, it is time
to reduce the scope to ensure delivery.


ISTM that since you can nest JSON then CREATE SCHEMA just needs to support 
providing an array of other JSON definitions, no? And it doesn't seem like it 
should be that hard to code that up.

That said, if it's a choice between getting a bunch of CREATE blah support or 
getting nothing because of CREATE SCHEMA with objects, then as long as we have a 
scheme that supports the full CREATE SCHEMA feature we shouldn't hold up just 
because it's not coded.
--
Jim C. Nasby, Data Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net




Re: [HACKERS] Add CREATE support to event triggers

2014-01-10 Thread Jim Nasby

On 1/10/14, 5:22 PM, Alvaro Herrera wrote:

Here's one idea: create a contrib module that (somehow, via APIs to be
invented) runs every DDL command that gets executed through the
deparsing code, and then parses the result and executes*that*  instead
of the original command.  Then, add a build target that runs the
regression test suite in that mode, and get the buildfarm configured
to run that build target regularly on at least some machines.  That
way, adding syntax to the regular regression test suite also serves to
test that the deparsing logic for that syntax is working.  If we do
this, there's still some maintenance burden associated with having DDL
deparsing code, but at least our chances of noticing when we've failed
to maintain it should be pretty good.

I gave this some more thought and hit a snag.  The problem here is that
by the time the event trigger runs, the original object has already been
created.  At that point, we can't simply replace the created objects
with objects that would hypothetically be created by a command trigger.

A couple of very hand-wavy ideas:

1. in the event trigger, DROP the original object and CREATE it as
reported by the creation_commands SRF.

2. Have ddl_command_start open a savepoint, and then roll it back in
ddl_command_end, then create the object again.  Not sure this is doable
because of the whole SPI nesting issue .. maybe with C-language event
trigger functions?


What if we don't try to do this all in one shot? I'm thinking let the original 
DDL do its thing while capturing the re-parsed commands, in order, somewhere. 
Dump those commands into a brand new database and use that for testing.
--
Jim C. Nasby, Data Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net




Re: [HACKERS] Add CREATE support to event triggers

2014-01-09 Thread Robert Haas
On Wed, Jan 8, 2014 at 10:27 PM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 Craig Ringer escribió:
 Instead, can't we use your already proposed subclause structure?

  {"authorization": {"authorization_role": "some guy",
    "output": "AUTHORIZATION %i{authorization_role}"},
   "if_not_exists": {"output": "IF NOT EXISTS"},
   "name": "some schema",
   "output": "CREATE SCHEMA %{if_not_exists} %i{name} %{authorization}"}

  i.e. "if_not_exists" becomes an object. All clauses are objects, all
  non-object values are user data. (right?). If the clause is absent, the
  "output" key is the empty string.

 The issue with that (and with your original proposal) is that you can't
 tell what these clauses are supposed to be if they're not present in the
 original query. You can't *enable* IF NOT EXISTS without pulling
 knowledge of that syntax from somewhere else.

 Depending on the problem you intend to solve there, that might be fine.

 Hmm.  This seems like a reasonable thing to do, except that I would like
 the output to always be the constant, and have some other way to
 enable the clause or disable it.  With your present boolean:
 so

  "if_not_exists": {"output": "IF NOT EXISTS",
                    "present": true/false}

Why not:

"if_not_exists": true/false

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Add CREATE support to event triggers

2014-01-09 Thread Alvaro Herrera
Robert Haas escribió:
 On Wed, Jan 8, 2014 at 10:27 PM, Alvaro Herrera
 alvhe...@2ndquadrant.com wrote:

  Hmm.  This seems like a reasonable thing to do, except that I would like
  the output to always be the constant, and have some other way to
  enable the clause or disable it.  With your present boolean:
  so
 
   "if_not_exists": {"output": "IF NOT EXISTS",
                     "present": true/false}
 
 Why not:
 
  "if_not_exists": true/false

Yeah, that's another option.  If we do this, though, the expansion
function would have to know that an "if_not_exists" element expands to "IF
NOT EXISTS".  Maybe that's okay.  Right now, the expansion function is
pretty stupid, which is nice.
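The trade-off can be made concrete with a toy sketch (the helper names are
invented, purely illustrative): with the object form the expander stays
generic, while the boolean form moves knowledge of each clause's SQL
spelling into the expander itself.

```python
def expand_clause_object(elem):
    # Object form: {"output": "IF NOT EXISTS", "present": true}.
    # The expander needs no knowledge of the clause; it just obeys "present".
    return elem["output"] if elem["present"] else ""

# Boolean form: "if_not_exists": true.  The expander must now carry a
# table mapping element names to their SQL spelling.
KNOWN_CLAUSES = {"if_not_exists": "IF NOT EXISTS"}

def expand_clause_bool(name, value):
    return KNOWN_CLAUSES[name] if value else ""

print(expand_clause_object({"output": "IF NOT EXISTS", "present": True}))
print(expand_clause_bool("if_not_exists", False))  # empty string
```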

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] Add CREATE support to event triggers

2014-01-09 Thread Jim Nasby

On 1/9/14, 11:58 AM, Alvaro Herrera wrote:

Robert Haas escribió:

On Wed, Jan 8, 2014 at 10:27 PM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:



Hmm.  This seems like a reasonable thing to do, except that I would like
the output to always be the constant, and have some other way to
enable the clause or disable it.  With your present boolean:
so

"if_not_exists": {"output": "IF NOT EXISTS",
                  "present": true/false}


Why not:

"if_not_exists": true/false


Yeah, that's another option.  If we do this, though, the expansion
function would have to know that an "if_not_exists" element expands to "IF
NOT EXISTS".  Maybe that's okay.  Right now, the expansion function is
pretty stupid, which is nice.


Yeah, the source side of this will always have to understand the nuances of 
every command; it'd be really nice to not burden the other side with that as 
well. The only downside I see is a larger JSON output, but meh.

Another advantage is if you really wanted to you could modify the output 
formatting in the JSON doc to do something radically different if so inclined...
--
Jim C. Nasby, Data Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net




Re: [HACKERS] Add CREATE support to event triggers

2014-01-08 Thread Alvaro Herrera
Alvaro Herrera escribió:
 Robert Haas escribió:
 
  I think this direction has some potential.  I'm not sure it's right in
  detail.  The exact scheme you propose above won't work if you want to
  leave out the schema name altogether, and more generally it's not
  going to help very much with anything other than substituting in
  identifiers.  What if you want to add a column called satellite_id to
  every table that gets created, for example?  What if you want to make
  the tables UNLOGGED?  I don't see how that kind of things is going to
  work at all cleanly.
 
 Thanks for the discussion.  I am building some basic infrastructure to
 make this possible, and will explore ideas to cover these oversights
 (not posting anything concrete yet because I expect several iterations
 to crash and burn before I have something sensible to post).

Here's a working example.  Suppose the user runs

CREATE SCHEMA IF NOT EXISTS "some schema" AUTHORIZATION "some guy";

In an event trigger, the function pg_event_trigger_get_creation_commands()
returns the following JSON blob:

{"authorization": {"authorization_role": "some guy",
  "output": "AUTHORIZATION %i{authorization_role}"},
 "if_not_exists": "IF NOT EXISTS",
 "name": "some schema",
 "output": "CREATE SCHEMA %{if_not_exists} %i{name} %{authorization}"}

wherein I have chosen to have a JSON element with the hardcoded name of
"output", which is what needs to be expanded; for each %{} parameter
found in it, there is an equally-named element in the JSON blob.  This
can be a string, a NULL, or another JSON object.

If it's a string, it expands to that value; if it's an object,
recursively an "output" element is expanded in the same way, and the
expanded string is used.

If there's a NULL element when expanding an object, the whole thing
expands to empty.  For example, if no AUTHORIZATION
clause is specified, the "authorization" element is still there, but the
"authorization_role" element within it is NULL, and so the whole
AUTHORIZATION clause expands to empty and the resulting command contains
no authorization clause.  This is useful to support the case where
someone doesn't have an AUTHORIZATION clause in the CREATE SCHEMA
command, and the event trigger injects one simply by setting the
"authorization_role" to some role name.

IF NOT EXISTS is handled by defining it to either the string "IF NOT
EXISTS" or to empty if no such clause was specified.

The user can modify elements in the JSON to get a different version of
the command.  (I reckon the "output" can also be modified, but this is
probably a bad idea in most/all cases.  I don't think there's a need to
prohibit this explicitly.)  Also, someone might define "if_not_exists"
to something completely unrelated, but that would be their own fault.
(Maybe we can have some cross-check that the "if_not_exists" element in
the JSON cannot be anything other than "IF NOT EXISTS" or the empty string;
and that the "output" element remains the same at expansion time as it
was at generation time.  Perhaps we should even hide the "output"
elements from the user completely and only add them to the JSON at time
of expansion.  Not sure it's worth the trouble.)

There is another function,
pg_event_trigger_expand_creation_command(json), which will expand the
above JSON blob and return the following text:

CREATE SCHEMA IF NOT EXISTS "some schema" AUTHORIZATION "some guy"

Note the identifiers are properly quoted (there are also quotes in the JSON
blob, but those correspond to JSON's own delimiters).  I have defined an
'i' modifier, as in %i{} elements, which means that the element is an
identifier which might need quoting.

I have also defined a %d{} modifier that means to use the element to
expand a possibly-qualified dotted name.  (There would be no "output"
element in this case.)  This is to support the case where you have

CREATE TABLE public.foo
which results in
{"table_name": {"schema": "public",
                "relname": "foo"}}

and you want to edit the "table_name" element in the root JSON and set
the "schema" to something else (perhaps NULL), so in the event trigger
after expansion you can end up with CREATE TABLE foo or CREATE TABLE
private.foo or whatever.

Most likely there are some more rules that will need to be created, but
so far this looks sensible.

I'm going to play some more with the %d{} stuff, and also with the idea
of representing table elements such as columns and constraints as an
array.  In the meantime please let me know whether this makes sense.
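As a sanity check of the rules above, here is a rough Python model of the
expansion (a sketch only: identifier quoting is simplified and the helper
names are invented; the real implementation would live in C):

```python
import re

class _Null(Exception):
    """Raised when a referenced element is NULL: the whole clause goes empty."""

def quote_ident(name):
    # Simplified PostgreSQL identifier quoting.
    if re.fullmatch(r'[a-z_][a-z0-9_]*', name):
        return name
    return '"' + name.replace('"', '""') + '"'

def expand_element(elem):
    if isinstance(elem, dict) and 'output' not in elem:
        # %d{}-style dotted name: join the non-NULL parts with periods.
        return '.'.join(quote_ident(v) for v in elem.values() if v is not None)
    if isinstance(elem, dict):
        def repl(m):
            sub = elem[m.group('name')]
            if sub is None:
                raise _Null
            text = expand_element(sub)
            return quote_ident(text) if m.group('mod') == 'i' else text
        try:
            return re.sub(r'%(?P<mod>[id])?\{(?P<name>\w+)\}', repl, elem['output'])
        except _Null:
            return ''        # a NULL anywhere inside makes the clause disappear
    return elem              # plain string: expands to itself

blob = {"authorization": {"authorization_role": "some guy",
                          "output": "AUTHORIZATION %i{authorization_role}"},
        "if_not_exists": "IF NOT EXISTS",
        "name": "some schema",
        "output": "CREATE SCHEMA %{if_not_exists} %i{name} %{authorization}"}
print(expand_element(blob))
# CREATE SCHEMA IF NOT EXISTS "some schema" AUTHORIZATION "some guy"
```

Setting `blob["authorization"]["authorization_role"] = None` before expanding
makes the whole AUTHORIZATION clause vanish, per the NULL rule above.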

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] Add CREATE support to event triggers

2014-01-08 Thread Pavel Stehule
Hello

I don't like this direction. What can we do with JSON from plpgsql? Moreover,
JSON is not a very robust format against future changes.

Regards

Pavel
On 8.1.2014 21:43 Alvaro Herrera alvhe...@2ndquadrant.com wrote:

 Alvaro Herrera escribió:
  Robert Haas escribió:
 
   I think this direction has some potential.  I'm not sure it's right in
   detail.  The exact scheme you propose above won't work if you want to
   leave out the schema name altogether, and more generally it's not
   going to help very much with anything other than substituting in
   identifiers.  What if you want to add a column called satellite_id to
   every table that gets created, for example?  What if you want to make
   the tables UNLOGGED?  I don't see how that kind of thing is going to
   work at all cleanly.
 
  Thanks for the discussion.  I am building some basic infrastructure to
  make this possible, and will explore ideas to cover these oversights
  (not posting anything concrete yet because I expect several iterations
  to crash and burn before I have something sensible to post).

 Here's a working example.  Suppose the user runs

 CREATE SCHEMA IF NOT EXISTS "some schema" AUTHORIZATION "some guy";

 In an event trigger, the function pg_event_trigger_get_creation_commands()
 returns the following JSON blob:

 {"authorization": {"authorization_role": "some guy",
                    "output": "AUTHORIZATION %i{authorization_role}"},
  "if_not_exists": "IF NOT EXISTS",
  "name": "some schema",
  "output": "CREATE SCHEMA %{if_not_exists} %i{name} %{authorization}"}

 wherein I have chosen to have a JSON element with the hardcoded name of
 output which is what needs to be expanded; for each %{} parameter
 found in it, there is an equally-named element in the JSON blob.  This
 can be a string, a NULL, or another JSON object.

 If it's a string, it expands to that value; if it's an object,
 recursively an output element is expanded in the same way, and the
 expanded string is used.

 If there's a NULL element when expanding an object, the whole thing
 expands to empty.  For example, if no AUTHORIZATION
 clause is specified, authorization element is still there, but the
 authorization_role element within it is NULL, and so the whole
 AUTHORIZATION clause expands to empty and the resulting command contains
 no authorization clause.  This is useful to support the case that
 someone doesn't have an AUTHORIZATION clause in the CREATE SCHEMA
 command, and the event trigger injects one simply by setting the
 authorization_role to some role name.

 IF NOT EXISTS is handled by defining it to either the string IF NOT
 EXISTS or to empty if no such clause was specified.

 The user can modify elements in the JSON to get a different version of
 the command.  (I reckon the output can also be modified, but this is
 probably a bad idea in most/all cases.  I don't think there's a need to
 prohibit this explicitly.)  Also, someone might define if_not_exists
 to something completely unrelated, but that would be their own fault.
 (Maybe we can have some cross-check that the if_not_exists element in
 JSON cannot be anything other than IF NOT EXISTS or the empty string;
 and that the output element remains the same at expansion time as it
 was at generation time.  Perhaps we should even hide the output
 element from the user completely and only add it to the JSON at time
 of expansion.  Not sure it's worth the trouble.)

 There is another function,
 pg_event_trigger_expand_creation_command(json), which will expand the
 above JSON blob and return the following text:

 CREATE SCHEMA IF NOT EXISTS "some schema" AUTHORIZATION "some guy"

 Note the identifiers are properly quoted (there are quotes in the JSON
 blob, but they correspond to JSON's own delimiters).  I have defined an
 'i' modifier to have %i{} elements, which means that the element is an
 identifier which might need quoting.

 I have also defined a %d{} modifier that means to use the element to
 expand a possibly-qualified dotted name.  (There would be no output
 element in this case.)  This is to support the case where you have

 CREATE TABLE public.foo
 which results in
 {"table_name": {"schema": "public",
                 "relname": "foo"}}

 and you want to edit the table_name element in the root JSON and set
 the schema to something else (perhaps NULL), so in the event trigger
 after expansion you can end up with CREATE TABLE foo or CREATE TABLE
 private.foo or whatever.

 Most likely there are some more rules that will need to be created, but
 so far this looks sensible.

 I'm going to play some more with the %d{} stuff, and also with the idea
 of representing table elements such as columns and constraints as an
 array.  In the meantime please let me know whether this makes sense.

 --
 Álvaro Herrera                http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Add CREATE support to event triggers

2014-01-08 Thread Alvaro Herrera
Pavel Stehule escribió:
 Hello
 
 I don't like this direction. What can we do with JSON from plpgsql?

We have plenty of JSON functions and operators in SQL, and more to come
soon.  Is that not enough?

 Moreover, JSON is not a very robust format against future changes.

Not sure what you mean.  This JSON is generated and consumed by our own
code, so we only need to concern ourselves with making sure that we can
consume (in the expansion function) what we generated in the previous
phase.  There is no transmission to the exterior.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Add CREATE support to event triggers

2014-01-08 Thread Alvaro Herrera
CC to hackers restored.

Pavel Stehule escribió:
 On 8.1.2014 23:17 Alvaro Herrera alvhe...@2ndquadrant.com wrote:
 
  Pavel Stehule escribió:
   Hello
  
   I don't like this direction. What can we do with JSON from plpgsql?
 
  We have plenty of JSON functions and operators in SQL, and more to come
  soon.  Is that not enough?
 
 No, it is not, I'm sure.  It is wrong to require parsing of system-internal
 data that is available in structured form.  You create a string that must
 immediately be parsed again.
 
 A few functions with OUT parameters are better than any semistructured string.

That was shot down, for good reasons: we assume that the users of this
are going to want to modify the command before considering it final.
Maybe they want to add a column to each table being created, or they
want to change the tablespace if the table name ends with _big, or
they want to change the schema in which it is created.

This JSON representation lets you receive the table creation data in a
well-known JSON schema; you can tweak individual elements without having
to parse the SQL command.  And when you're done tweaking, there's a
function that lets you produce the SQL command that corresponds to the
original with the tweaks you just did.

(Please note that, thus far, this facility DOES NOT let you change the
table that was created, at least not directly: these event triggers are
run AFTER the creation command has completed.  You can tweak the command
that would be sent to a remote server in a replication swarm, for
example.  Or create a mirror table for audit purposes.  Or perhaps even
generate an ALTER TABLE command for the new table.)

If by "few functions with OUT parameters" you mean that we need to have
one record type that is able to receive all possible CREATE TABLE
options, so that you can change them as a record in plpgsql, this
doesn't sound too good to me, for three reasons: a) it's going to
require too many types and functions (one per statement type); b)
cramming the stuff in pg_type.h / pg_proc.h is going to be a horrid
task; c) any change is going to require an initdb.

-- 
Alvaro Herrera




Re: [HACKERS] Add CREATE support to event triggers

2014-01-08 Thread Craig Ringer
On 01/09/2014 04:42 AM, Alvaro Herrera wrote:

 If there's a NULL element when expanding an object, the whole thing
 expands to empty.  For example, if no AUTHORIZATION
 clause is specified, authorization element is still there, but the
 authorization_role element within it is NULL, and so the whole
 AUTHORIZATION clause expands to empty and the resulting command contains
 no authorization clause.

I'd like to see this applied consistently to argument-less clauses like
IF NOT EXISTS too. So the same rules apply.

 IF NOT EXISTS is handled by defining it to either the string IF NOT
 EXISTS or to empty if no such clause was specified.

I'm not keen on this bit. It puts clauses of syntax into value strings
other than the special "output" key, and those keys aren't easily
distinguished from user data without clause-specific knowledge.

Instead, can't we use your already proposed subclause structure?

{"authorization": {"authorization_role": "some guy",
                   "output": "AUTHORIZATION %i{authorization_role}"},
 "if_not_exists": {"output": "IF NOT EXISTS"},
 "name": "some schema",
 "output": "CREATE SCHEMA %{if_not_exists} %i{name} %{authorization}"}

i.e. if_not_exists becomes an object. All clauses are objects, and all
non-object values are user data (right?). If the clause is absent, the
output key is the empty string.

The issue with that (and with your original proposal) is that you can't
tell what these clauses are supposed to be if they're not present in the
original query. You can't *enable* IF NOT EXISTS without pulling
knowledge of that syntax from somewhere else.

Depending on the problem you intend to solve there, that might be fine.

If it isn't, then instead there just needs to be a key to flag such
clauses as present or not.


{"authorization": {"authorization_role": "some guy",
                   "output": "AUTHORIZATION %i{authorization_role}"},
 "if_not_exists": {"output": "IF NOT EXISTS",
                   "present": true},
 "name": "some schema",
 "output": "CREATE SCHEMA %{if_not_exists} %i{name} %{authorization}"}

Am I just over-complicating something simple here?

My reasoning is that it'd be good to be able to easily tell the
difference between *structure* and *user data* in these query trees and
do so without possibly version-specific and certainly
syntax/clause-specific knowledge about the meaning of every key of every
clause.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Add CREATE support to event triggers

2014-01-08 Thread Alvaro Herrera
Craig Ringer escribió:

 Instead, can't we use your already proposed subclause structure?
 
 {"authorization": {"authorization_role": "some guy",
                    "output": "AUTHORIZATION %i{authorization_role}"},
  "if_not_exists": {"output": "IF NOT EXISTS"},
  "name": "some schema",
  "output": "CREATE SCHEMA %{if_not_exists} %i{name} %{authorization}"}
 
 i.e. if_not_exists becomes an object. All clauses are objects, all
 non-object values are user data. (right?). If the clause is absent, the
 output key is the empty string.
 
 The issue with that (and with your original proposal) is that you can't
 tell what these clauses are supposed to be if they're not present in the
 original query. You can't *enable* IF NOT EXISTS without pulling
 knowledge of that syntax from somewhere else.
 
 Depending on the problem you intend to solve there, that might be fine.

Hmm.  This seems like a reasonable thing to do, except that I would like
the "output" to always be the constant, and have some other way to
enable the clause or disable it.  With your "present" boolean it would
be:

"if_not_exists": {"output": "IF NOT EXISTS",
                  "present": true/false}

In fact, I'm now wondering whether this is a better idea than not
emitting anything when some element in the output expands to NULL; so it
would apply to authorization as well; if the command includes the
clause, it'd be

 {"authorization": {"authorization_role": "some guy",
                    "present": true,
                    "output": "AUTHORIZATION %i{authorization_role}"},

and if there wasn't anything, you'd have

 {"authorization": {"authorization_role": null,
                    "present": false,
                    "output": "AUTHORIZATION %i{authorization_role}"},

so if you want to turn it on and it wasn't, you need to change both the
present boolean and also set the authorization_role element; and if you
want to turn it off when it was present, just set present to false.
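The "present" scheme described above can be modeled in a short sketch. This is hypothetical Python, with expand_clause() standing in for whatever the real expansion function would do; the point is that the "output" template stays constant and the boolean alone controls emission:

```python
import re

def quote_ident(name):
    # Rough analogue of PostgreSQL's quote_ident().
    if re.fullmatch(r'[a-z_][a-z0-9_]*', name):
        return name
    return '"' + name.replace('"', '""') + '"'

def expand_clause(clause):
    # The "present" boolean alone decides whether the clause is
    # emitted; the "output" template is never edited by the user.
    if not clause["present"]:
        return ''
    def repl(m):
        value = clause[m.group(2)]
        return quote_ident(value) if m.group(1) == 'i' else value
    return re.sub(r'%(i?)\{(\w+)\}', repl, clause["output"])

auth = {"authorization_role": None, "present": False,
        "output": "AUTHORIZATION %i{authorization_role}"}
print(repr(expand_clause(auth)))   # ''

# Turning the clause on means setting both the flag and the role:
auth["present"] = True
auth["authorization_role"] = "some guy"
print(expand_clause(auth))         # AUTHORIZATION "some guy"
```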

 Am I just over-complicating something simple here?

I think it's a fair point.

 My reasoning is that it'd be good to be able to easily tell the
 difference between *structure* and *user data* in these query trees and
 do so without possibly version-specific and certainly
 syntax/clause-specific knowledge about the meaning of every key of every
 clause.

Sounds reasonable.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Add CREATE support to event triggers

2014-01-06 Thread Robert Haas
On Fri, Jan 3, 2014 at 9:05 AM, Alvaro Herrera alvhe...@2ndquadrant.com wrote:
 The other thing that bothers me here is that, while a normalized
 command string sounds great in theory, as soon as you want to allow
 (for example) mapping schema A on node 1 to schema B on node 2, the
 wheels come off: you'll have to deparse that normalized command string
 so you can change out the schema name and then reassemble it back into
 a command string again.  So we're going to parse the user input, then
 deparse it, hand over the results to the application code, which will
 then parse it, modify that, and deparse it again.

 I have considered several ideas on this front, but most of them turn out
 to be useless or too cumbersome to use.  What seems most adequate is to
 build a command string containing certain patterns, and an array of
 replacement values for such patterns; each pattern corresponds to one
 element that somebody might want modified in the command.  As a trivial
 example, a command such as

CREATE TABLE foo (bar INTEGER);

 would return a string like
CREATE TABLE ${table_schema}.${table_name} (bar INTEGER);

 and the replacement array would be
{table_schema = "public", table_name = "foo"}

 If we additionally provide a function to expand the replacements in the
 string, we would have the base functionality of a normalized command
 string.  If somebody wants to move the table to some other schema, they
 can simply modify the array to suit their taste, and again expand using
 the provided function; this doesn't require parsing SQL.  It's likely
 that there are lots of fine details that need exploring before this is a
 fully workable idea -- I have just started work on it, so please bear
 with me.

 I think this is basically what you call a JSON blob.

I think this direction has some potential.  I'm not sure it's right in
detail.  The exact scheme you propose above won't work if you want to
leave out the schema name altogether, and more generally it's not
going to help very much with anything other than substituting in
identifiers.  What if you want to add a column called satellite_id to
every table that gets created, for example?  What if you want to make
the tables UNLOGGED?  I don't see how that kind of thing is going to
work at all cleanly.

What I can imagine that might work is that you get a JSON blob for a
create table statement and then you have a function
make_a_create_table_statement(json) that will turn it back into SQL.
You can pass a modified blob (adding, removing, or changing the
schema_name or any other element) instead and it will do its thing.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Add CREATE support to event triggers

2014-01-06 Thread Alvaro Herrera
Robert Haas escribió:

 I think this direction has some potential.  I'm not sure it's right in
 detail.  The exact scheme you propose above won't work if you want to
 leave out the schema name altogether, and more generally it's not
 going to help very much with anything other than substituting in
 identifiers.  What if you want to add a column called satellite_id to
 every table that gets created, for example?  What if you want to make
 the tables UNLOGGED?  I don't see how that kind of thing is going to
 work at all cleanly.

Thanks for the discussion.  I am building some basic infrastructure to
make this possible, and will explore ideas to cover these oversights
(not posting anything concrete yet because I expect several iterations
to crash and burn before I have something sensible to post).

 What I can imagine that might work is that you get a JSON blob for a
 create table statement and then you have a function
 make_a_create_table_statement(json) that will turn it back into SQL.
 You can pass a modified blob (adding, removing, or changing the
 schema_name or any other element) instead and it will do its thing.

I agree, except that I would prefer not to have one function for each
DDL statement type; instead we would have a single function that knows how
to expand arbitrary strings using arbitrary JSON parameter objects, and
let ddl_rewrite.c (or whatever we call that file) deal with all the
ugliness.

One function per statement would increase the maintenance cost more, I
think (three extra pg_proc entries every time you add a new object type?
Ugh.)

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Add CREATE support to event triggers

2014-01-06 Thread Robert Haas
On Mon, Jan 6, 2014 at 1:17 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote:
 I agree, except that I would prefer not to have one function for each
 DDL statement type; instead we would have a single function that knows
 to expand arbitrary strings using arbitrary JSON parameter objects, and
 let ddl_rewrite.c (or whatever we call that file) deal with all the
 ugliness.

Yeah, that might work.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Add CREATE support to event triggers

2014-01-03 Thread Alvaro Herrera
Robert Haas escribió:

 The other thing that bothers me here is that, while a normalized
 command string sounds great in theory, as soon as you want to allow
 (for example) mapping schema A on node 1 to schema B on node 2, the
 wheels come off: you'll have to deparse that normalized command string
 so you can change out the schema name and then reassemble it back into
 a command string again.  So we're going to parse the user input, then
 deparse it, hand over the results to the application code, which will
 then parse it, modify that, and deparse it again.

I have considered several ideas on this front, but most of them turn out
to be useless or too cumbersome to use.  What seems most adequate is to
build a command string containing certain patterns, and an array of
replacement values for such patterns; each pattern corresponds to one
element that somebody might want modified in the command.  As a trivial
example, a command such as

   CREATE TABLE foo (bar INTEGER);

would return a string like
   CREATE TABLE ${table_schema}.${table_name} (bar INTEGER);

and the replacement array would be
   {table_schema = "public", table_name = "foo"}

If we additionally provide a function to expand the replacements in the
string, we would have the base functionality of a normalized command
string.  If somebody wants to move the table to some other schema, they
can simply modify the array to suit their taste, and again expand using
the provided function; this doesn't require parsing SQL.  It's likely
that there are lots of fine details that need exploring before this is a
fully workable idea -- I have just started work on it, so please bear
with me.

I think this is basically what you call a JSON blob.
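The pattern-plus-replacements idea maps almost directly onto ordinary template substitution. A toy illustration with Python's string.Template (note this sketch does no identifier quoting, which a real implementation would need):

```python
from string import Template

# The normalized command keeps ${...} placeholders; a separate mapping
# carries the values the trigger may edit before expanding.
command = Template('CREATE TABLE ${table_schema}.${table_name} (bar INTEGER);')
replacements = {"table_schema": "public", "table_name": "foo"}

print(command.substitute(replacements))
# CREATE TABLE public.foo (bar INTEGER);

# Moving the table to another schema is a plain dict edit — no SQL
# parsing required — followed by re-expansion:
replacements["table_schema"] = "archive"
print(command.substitute(replacements))
# CREATE TABLE archive.foo (bar INTEGER);
```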

 Finally, I'm very skeptical of the word normalized.  To me, that
 sounds like an alias for modifying the command string in unspecified
 ways that big brother thinks will be useful to event trigger authors.
  Color me skeptical.  What if somebody doesn't want their command
 string normalized?  What if they want it normalized in a way that's
 different from the way that we've chosen to normalize it?  I fear that
 this whole direction amounts to we don't know how to design a real
 API so let's just do surgery on the command string and call whatever
 pops out the API.

You might criticize the example above by saying that I haven't
considered using a JSON array for the list of table elements; in a
sense, I would be being Big Brother and deciding that you (as the user)
don't need to mess up with the column/constraints list in a table you're
creating.  I thought about it and wasn't sure if there was a need to
implement that bit in the first iteration of this implementation.  One
neat thing about this string+replaceables idea is that we can later
change what replaceable elements the string has, thus providing more
functionality (thus, for example, perhaps the column list can be altered
in v2 that was a constant in v1), without breaking existing users of
the v1.

  but there
  is a slight problem for some kind of objects that are represented partly
  as ALTER state during creation; for example creating a table with a
  sequence uses ALTER SEQ/OWNED BY internally at some point.  There might
  be other cases I'm missing, also.  (The REFRESH command is nominally
  also supported.)
 
 There are lots of places in the DDL code where we pass around
 constructed parse trees as a substitute for real argument lists.  I
 expect that many of those places will eventually get refactored away,
 so it's important that this feature does not end up relying on
 accidents of the current code structure.  For example, an
 AlterTableStmt can actually do a whole bunch of different things in a
 single statement: SOME of those are handled by a loop in
 ProcessUtilitySlow() and OTHERS are handled internally by AlterTable.
 I'm pretty well convinced that that division of labor is a bad design,
 and I think it's important that this feature doesn't make that dubious
 design decision into documented behavior.

Yeah, the submitted patch took care of these elements by invoking the
appropriate collection function at all the right places.  Most of it
happened right in ProcessUtilitySlow, but other bits were elsewhere (for
instance, sub-objects created in a complex CREATE SCHEMA command).  I
mentioned the ALTER SEQUENCE example above because that happens in a
code path that wasn't even close to the rest of the stuff.

  Now about the questions I mentioned above:
 
  a) It doesn't work to reverse-parse the statement nodes in all cases;
  there are several unfixable bugs if we only do that.  In order to create
  always-correct statements, we need access to the catalogs for the
  created objects.  But if we are doing catalog access, then it seems to
  me that we can do away with the statement parse nodes completely and
  just reconstruct the objects from catalog information.  Shall we go that
  route?
 
 That works well for CREATE and is definitely appealing in some 

Re: [HACKERS] Add CREATE support to event triggers

2013-11-24 Thread Andrew Tipton
On Thu, Nov 21, 2013 at 2:36 AM, Christopher Browne cbbro...@gmail.com wrote:
 b) What's the best design of the SRF output?  This patch proposes two
 columns, object identity and create statement.  Is there use for
 anything else?  Class/object OIDs perhaps, schema OIDs for objects types
 that have it?  I don't see any immediate need to that info, but perhaps
 someone does.

 Probably an object type is needed as well, to know if it's a table or
 a domain or a sequence or whatever.

 I suspect that what will be needed to make it all usable is some sort of
 structured form.  That is in keeping with Robert Haas' discomfort with
 the normalized form.

 My minor gripe is that you haven't normalized enough (e.g. - it should be
 CREATE TABLE foo.foo2 (acolumn foo.bar), capturing the normalization of
 data types that are referenced).

 But Robert's quite right that users may want more than just to capture that
 literally; they may want to modify it, for instance, by shifting to another
 schema.  And it will be no fun at all if you have to construct an SQL parser
 in order to change it.


It's certainly much easier to transform a structured representation
into a valid SQL command string than it is to do the inverse.

You may be interested in an extension that I'm working on for a
client, which provides relation_create, relation_alter, and
relation_drop event triggers for 9.3:

  https://bitbucket.org/malloclabs/pg_schema_triggers

I decided to create a composite type for each event, which can be
accessed from within the event trigger by calling a special function.
For example, the relation_alter event supplies the relation Oid and
the old and new pg_class rows.  It's easy to then examine the old
vs. new rows and determine what has changed.


Kind regards,
Andrew Tipton




Re: [HACKERS] Add CREATE support to event triggers

2013-11-20 Thread Christopher Browne
On Fri, Nov 8, 2013 at 10:33 AM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 Hello,

 Attached you can find a very-much-WIP patch to add CREATE info support
 for event triggers (normalized commands).  This patch builds mainly on
 two things:

 1. Dimitri's DDL rewrite patch he submitted way back, in
http://www.postgresql.org/message-id/m2zk1j9c44@2ndquadrant.fr

 I borrowed the whole ddl_rewrite.c code, and tweaked it a bit.  There
 are several things still wrong with it and which will need to be fixed
 before a final patch can even be contemplated; but there are some
 questions that require a consensus answer before I go and fix it all,
 because what it will look like will depend on said answers.

I have tried this out; the patch applies fine.

Note that it induces (modulo my environment) a failure in make check.

The opr_sanity test fails.

postgres@cbbrowne ~/p/s/t/regress $ diff expected/opr_sanity.out results/opr_sanity.out
348,350c348,351
<  oid | proname
< -----+---------
< (0 rows)
---
>  oid  |                  proname
> ------+-------------------------------------------
>  3567 | pg_event_trigger_get_normalized_commands
> (1 row)

That's a minor problem; the trouble there is that the new function is not
yet documented.  Not a concern at this stage.

 2. The ideas we used to build DROP support.  Mainly, the interesting
thing here is the fact that we use a SRF to report, at
ddl_command_end, all the objects that were created during execution
of that command.  We do this by collecting them in a list in some raw
form somewhere during ProcessUtility, and then spitting them out if
the SRF is called.  I think the general idea is sound, although of
course I admit there might be bugs in the implementation.

 Note this patch doesn't try to add any kind of ALTER support.  I think
 this is fine in principle, because we agreed that we would attack each
 kind of command separately (divide to conquer and all that); but there
 is a slight problem for some kind of objects that are represented partly
 as ALTER state during creation; for example creating a table with a
 sequence uses ALTER SEQ/OWNED BY internally at some point.  There might
 be other cases I'm missing, also.  (The REFRESH command is nominally
 also supported.)

I imagine that the things we create in earlier stages may help with later
stages, so it's worth *some* planning so we can hope not to build bits
now that push later enhancements into corners that they can't get out of.

But I'm not disagreeing at all.

 Now about the questions I mentioned above:

 a) It doesn't work to reverse-parse the statement nodes in all cases;
 there are several unfixable bugs if we only do that.  In order to create
 always-correct statements, we need access to the catalogs for the
 created objects.  But if we are doing catalog access, then it seems to
 me that we can do away with the statement parse nodes completely and
 just reconstruct the objects from catalog information.  Shall we go that
 route?

Here's a case where it doesn't work.

testevent@localhost-> create schema foo;
CREATE SCHEMA
testevent@localhost-> create domain foo.bar integer;
CREATE DOMAIN
testevent@localhost-> CREATE OR REPLACE FUNCTION snitch() RETURNS
event_trigger LANGUAGE plpgsql AS $$
testevent$# DECLARE
testevent$# r RECORD;
testevent$# BEGIN
testevent$# FOR r IN SELECT * FROM
pg_event_trigger_get_normalized_commands()
testevent$# LOOP
testevent$# RAISE NOTICE 'object created: id %, statement %',
testevent$# r.identity, r.command;
testevent$# END LOOP;
testevent$# END;
testevent$# $$;
CREATE FUNCTION
testevent@localhost-> CREATE EVENT TRIGGER snitch ON ddl_command_end
EXECUTE PROCEDURE snitch();
CREATE EVENT TRIGGER
testevent@localhost-> set search_path to public, foo;
SET
testevent@localhost-> create table foo.foo2 (acolumn bar);
NOTICE:  object created: id foo.foo2, statement CREATE TABLE foo.foo2
(acolumn bar)
CREATE TABLE

The trouble is that you have only normalized the table name.  The
domain, bar, needs its name normalized as well.

 b) What's the best design of the SRF output?  This patch proposes two
 columns, object identity and create statement.  Is there use for
 anything else?  Class/object OIDs perhaps, schema OIDs for objects types
 that have it?  I don't see any immediate need to that info, but perhaps
 someone does.

Probably an object type is needed as well, to know if it's a table or
a domain or a sequence or whatever.

I suspect that what will be needed to make it all usable is some sort of
structured form.  That is in keeping with Robert Haas' discomfort with
the normalized form.

My minor gripe is that you haven't normalized enough (e.g. - it should be
CREATE TABLE foo.foo2 (acolumn foo.bar), capturing the normalization of
data types that are referenced).

But Robert's quite right that users may want more than just to capture that
literally; they may want to modify it, for instance, 

Re: [HACKERS] Add CREATE support to event triggers

2013-11-12 Thread Robert Haas
On Fri, Nov 8, 2013 at 10:33 AM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 Attached you can find a very-much-WIP patch to add CREATE info support
 for event triggers (normalized commands).  This patch builds mainly on
 two things:

 1. Dimitri's DDL rewrite patch he submitted way back, in
http://www.postgresql.org/message-id/m2zk1j9c44@2ndquadrant.fr

 I borrowed the whole ddl_rewrite.c code, and tweaked it a bit.  There
 are several things still wrong with it and which will need to be fixed
 before a final patch can even be contemplated; but there are some
 questions that require a consensus answer before I go and fix it all,
 because what it will look like will depend on said answers.

I'm still unhappy with this whole concept.  It adds a significant
maintenance burden that must be carried by everyone who adds new DDL
syntax in the future, and it's easy to imagine this area of the code
ending up very poorly tested and rife with bugs.  After all, event
triggers, as nifty as they are, are only going to be used by a small
minority of users.  And new DDL features are typically going to be
things that are fairly obscure, so again they will only be used by a
small minority of users.  I think we need to avoid the situation where
we have bugs that can only get found when those minorities happen to
intersect.  If we're going to have DDL rewrite code, then we need a
way of making sure that it gets tested in a comprehensive way on a
regular basis.

Here's one idea: create a contrib module that (somehow, via APIs to be
invented) runs every DDL command that gets executed through the
deparsing code, and then parses the result and executes *that* instead
of the original command.  Then, add a build target that runs the
regression test suite in that mode, and get the buildfarm configured
to run that build target regularly on at least some machines.  That
way, adding syntax to the regular regression test suite also serves to
test that the deparsing logic for that syntax is working.  If we do
this, there's still some maintenance burden associated with having DDL
deparsing code, but at least our chances of noticing when we've failed
to maintain it should be pretty good.
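As a rough illustration of that round-trip idea, here is a toy Python sketch under heavy assumptions: a miniature parser/deparser stands in for PostgreSQL's real parse tree and deparse code, and run_roundtrip plays the role the proposed contrib module would, executing the deparsed text instead of the original.

```python
# Toy model of the proposed test mode: every DDL command is parsed,
# deparsed, and the *deparsed* text is what actually gets executed.
# The mini-grammar below is a stand-in for the real parser, and the
# function names are invented for this sketch.
import re

def parse_create_table(sql):
    """Parse a minimal 'CREATE TABLE schema.name (col type, ...)' form."""
    m = re.match(r"CREATE TABLE (\w+)\.(\w+)\s*\((.*)\)\s*;?\s*$", sql, re.I)
    if not m:
        raise ValueError("unsupported syntax: %r" % sql)
    schema, name, cols = m.groups()
    columns = [tuple(c.strip().split(None, 1)) for c in cols.split(",")]
    return {"schema": schema, "name": name, "columns": columns}

def deparse(ast):
    cols = ", ".join("%s %s" % c for c in ast["columns"])
    return "CREATE TABLE %s.%s (%s);" % (ast["schema"], ast["name"], cols)

def run_roundtrip(sql, execute):
    """Execute the deparsed form, as the proposed contrib module would."""
    execute(deparse(parse_create_table(sql)))

executed = []
run_roundtrip("create table foo.foo2 (acolumn foo.bar)", executed.append)
# The regression suite would then compare the resulting catalog state
# against a run that executed the original text directly.
```

Running the existing regression suite in such a mode means every new piece of DDL syntax added with tests automatically exercises its deparse path as well.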

The other thing that bothers me here is that, while a normalized
command string sounds great in theory, as soon as you want to allow
(for example) mapping schema A on node 1 to schema B on node 2, the
wheels come off: you'll have to deparse that normalized command string
so you can change out the schema name and then reassemble it back into
a command string again.  So we're going to parse the user input, then
deparse it, hand over the results to the application code, which will
then parse it, modify that, and deparse it again.  At every step of
that process, any mistake will lead to subtle bugs in the resulting
system.  Larry Wall once wrote (approximately) that a good programming
language makes simple things simple and hard things possible; I think
this design fails the second prong of that test.

Now, I guess I can live with that if it's what everyone else wants,
but I don't have a great feeling about the long-term utility of it.
Exposing the data from the DDL statement in some structured way - like
what we've done with drops, or a JSON blob, or something like that,
feels much more robust and useful than a normalized command string to
me.  If the normalized command string can easily be built up from that
data, then you don't really need to expose the command string
separately.  If it can't, then you're not doing a good job exposing
the data in a usable form.  Saying "well, people can always parse the
normalized command string" is a cop-out.  Parsing SQL is *hard*; the
only external project I know of that parses our SQL syntax well is
pgpool, and that's because they copy our parser wholesale, surely not
the sort of solution we want to foist off on event trigger authors.
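A minimal sketch of that structured alternative, assuming an invented JSON layout (object_type, schema, name, columns; none of these field names come from the patch): the trigger author edits fields directly and rebuilds the command string, so remapping schema A to schema B needs no string surgery at all.

```python
# Sketch: expose the CREATE command as structured data (a JSON blob)
# and derive the command string from it. Field names are invented.
import json

def render(cmd):
    """Build a command string from the structured form."""
    assert cmd["object_type"] == "table"
    cols = ", ".join("%s %s" % (c["name"], c["type"]) for c in cmd["columns"])
    return "CREATE TABLE %s.%s (%s);" % (cmd["schema"], cmd["name"], cols)

cmd = json.loads("""
{
  "object_type": "table",
  "schema": "a",
  "name": "foo2",
  "columns": [{"name": "acolumn", "type": "foo.bar"}]
}
""")

# Mapping schema "a" on node 1 to schema "b" on node 2 is a field
# assignment, not a parse/deparse round trip:
cmd["schema"] = "b"
print(render(cmd))  # CREATE TABLE b.foo2 (acolumn foo.bar);
```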

Finally, I'm very skeptical of the word "normalized".  To me, that
sounds like an alias for modifying the command string in unspecified
ways that big brother thinks will be useful to event trigger authors.
 Color me skeptical.  What if somebody doesn't want their command
string normalized?  What if they want it normalized in a way that's
different from the way that we've chosen to normalize it?  I fear that
this whole direction amounts to "we don't know how to design a real
API, so let's just do surgery on the command string and call whatever
pops out 'the API'."  Maybe that's harsh, but if it is I don't know why.
 The normalization steps we build into this process constitute
assumptions about how the feature will be used, and they back the user
into using that feature in just that way and no other.

 2. The ideas we used to build DROP support.  Mainly, the interesting
thing here is the fact that we use a SRF to report, at
ddl_command_end, all the objects that were created during execution
of that command.  We do this by collecting them in a list in some raw
form