Re: [HACKERS] Command Triggers patch v18

2012-04-03 Thread Robert Haas
On Sun, Apr 1, 2012 at 7:22 AM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote:
 See above example: I am pretty sure you need a stack.

 In next version, certainly. As of now I'm willing to start a new stack
 in each command executed in a command trigger. That means 9.2 will only
 expose the first level of the stack, I guess.

I think we're talking past each other.  If someone executes DDL
command A and the command trigger executes DDL command B which fires
another command trigger, then the command trigger for A needs to see
the information relevant to A both before and after the command
trigger for B executes.  So you can't just store all the context
information in a flat global variable, because otherwise when the
trigger for B executes it will clobber the values for A.  You need to
do a save/restore so that when the execution of the trigger on A
resumes, it still sees the right stuff.

 Well it depends on what you're achieving with replication, this term
 includes so many different use cases… What I want core to provide is the
 mechanism that allows implementing the replication you need.

Agreed.

 See above - generally, I think that it's useful for a command trigger
 to know that it's being called because of a DDL event, rather than
 some command that could be doing anything.  Also, I think that wanting
 to hook all DDL commands is likely to be a useful thing to do, and
 without having to explicitly list 50 command names...

 Yeah, just omit the WHEN clause then.

Well, that's absolutely everything rather than just all DDL.
What's the use case for that?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers patch v18

2012-04-03 Thread Dimitri Fontaine
Robert Haas robertmh...@gmail.com writes:
 I think we're talking past each other.  If someone executes DDL
 command A and the command trigger executes DDL command B which fires
 another command trigger, then the command trigger for A needs to see
 the information relevant to A both before and after the command
 trigger for B executes.  So you can't just store all the context
 information in a flat global variable, because otherwise when the
 trigger for B executes it will clobber the values for A.  You need to
 do a save/restore so that when the execution of the trigger on A
 resumes, it still sees the right stuff.

Agreed. I was trying to say that we won't have all of this in 9.2.

 Yeah, just omit the WHEN clause then.

 Well, that's absolutely everything rather than just all DDL.
 What's the use case for that?

You can still filter in the trigger code.  A use case is forbidding any
command that's not purely data related to happen in certain cases, like
live server migrations.

-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers patch v18

2012-04-01 Thread Dimitri Fontaine
Robert Haas robertmh...@gmail.com writes:
 How about calling it command tag?  I think both context and toplevel
 are inconsistent with other uses of those terms: context is an
 error-reporting field, among other things; and we don't care about the
 toplevel command, just the command-tag of the one we're executing -
 e.g. if DROP fires a command trigger which invokes CREATE which fires
 another command trigger, the inner one is going to get CREATE not
 DROP.  Or at least so I presume.

It's not about that though, it's about a DROP TYPE that cascades to a
DROP FUNCTION, or a DROP SCHEMA that cascades to 10 DROP TABLE. I want
to know in each cascaded DROP TABLE that it's happening as a result of
DROP SCHEMA ... CASCADE, so I'm calling that a top-level command.

 See above example: I am pretty sure you need a stack.

In next version, certainly. As of now I'm willing to start a new stack
in each command executed in a command trigger. That means 9.2 will only
expose the first level of the stack, I guess.

Also there's a difference in CASCADE (no new command emitted) and in an
event trigger that executes a new top-level command.

 I would not want my replication system issuing cascaded drops, because
 if the sides don't match it might cascade to something on the remote
 side that it doesn't cascade to on the local side, which exceeds my
 tolerance for scary behavior.

Well it depends on what you're achieving with replication, this term
includes so many different use cases… What I want core to provide is the
mechanism that allows implementing the replication you need.

 There are far too many variants and cases of our command to be able to
 extract their parameters in a flat way (a bunch of variables compared to
 a nested description ala json or xml), and I don't think such a flat
 representation is going to be much better than the parse tree.

 I strongly disagree.  I think we'll find that with the right choice of
 hook points, the number of variables that need to be exposed is quite
 compact.  Indeed, I'd venture to say that needing to pass lots and
 lots of information is evidence that you've made a poor choice of hook
 point.

Currently we're exposing a very limited set of variables. So I think
we're good in your book.

 No, but whether or not you mention it in the CREATE TRIGGER syntax has
 nothing to do with whether it's available as a magic parameter inside
 the procedure.  Those things out to be independent.  I imagine that
 the stuff that is accessible from inside the trigger will be richer
 than what you can do in the trigger syntax itself.

Exactly, we're in agreement here.

 I'm still not sold on the idea of lumping together every command under
 a single command_start event.  I can't see anyone wanting to hook
 anything that broad.  Don't forget that we need to document not only
 which triggers will fire but also what magic variables they'll get. A

Yeah, and command start triggers are only going to have tag, toplevel
tag or whatever the right name of that is, and parse tree if it's
written in C. And that's it. The command start event trigger are
typically fired directly from utility.c.

 dcl_command_start hook could conceivably get the list of privileges
 being granted or revoked, but a general command_start trigger is going
 to need a different set of magic variables depending on the actual
 command type.  I think it might be simpler and more clear to say that
 each event type provides these variables, rather than having to
 conditionalize it based on the command type.

That's going to be true for other event timing specs, but not for the
command start as I picture it.

 See above - generally, I think that it's useful for a command trigger
 to know that it's being called because of a DDL event, rather than
 some command that could be doing anything.  Also, I think that wanting
 to hook all DDL commands is likely to be a useful thing to do, and
 without having to explicitly list 50 command names...

Yeah, just omit the WHEN clause then.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers patch v18

2012-03-31 Thread Robert Haas
On Fri, Mar 30, 2012 at 11:19 AM, Dimitri Fontaine
dimi...@2ndquadrant.fr wrote:
 Yeah context is not explicit, we could call that toplevel: the command
 tag of the command that the user typed. When toplevel is null, the event
 trigger is fired on a command the user sent, when it's not null, the
 trigger is fired on some inner command operation.

How about calling it command tag?  I think both context and toplevel
are inconsistent with other uses of those terms: context is an
error-reporting field, among other things; and we don't care about the
toplevel command, just the command-tag of the one we're executing -
e.g. if DROP fires a command trigger which invokes CREATE which fires
another command trigger, the inner one is going to get CREATE not
DROP.  Or at least so I presume.

 tag.  I think for a drop trigger I would want the function to
 receive this information: type of object dropped, OID of object
 dropped, column number in the case of a column drop, flag indicating
 whether it's a toplevel drop or a cascaded drop.  I wouldn't object to
 also making the currently-in-context toplevel command tag available,
 but I think most drop triggers wouldn't really care, so I wouldn't
 personally spend much implementation effort on it if it turns out to
 be hard.

 I'm not sure it would be hard as I'm only seeing a single depth possible
 here, so a single per-backend string static variable would do.

See above example: I am pretty sure you need a stack.

 But in general, I don't really know what a proper subcommand is or
 why some subcommands should be more proper than others, or why we
 should even be concerned about whether something is a subcommand at
 all.  I think it's fine and useful to have triggers that fire in those
 kinds of places, but I don't see why we should limit ourselves to
 that.  For applications like replication, auditing, and enhanced
 security, the parse tree and subcommand/non-subcommand status of a
 particular operation are irrelevant.  What you need is an exact

 Not really. When replicating you could perfectly say that you only
 replicate the toplevel DROP because the replica will also do the cascade
 dance and you might have decided not to replicate all related objects on
 the other side.

I would not want my replication system issuing cascaded drops, because
if the sides don't match it might cascade to something on the remote
side that it doesn't cascade to on the local side, which exceeds my
tolerance for scary behavior.

 The information you need really want not to miss is when only the
 cascaded object is part of the replication, not the main one. That was
 not covered by my previous patch but now we have a way to cover it.

Also true.

 description of the operation that got performed (e.g. the default on
 table X column Y got dropped); you might be able to reverse-engineer
 that from the parse tree, but it's much better to have the system pass
 you the information you need more directly.  Certainly, there are
 cases where you might want to have the parse tree, or even the raw
 command text, available, but I'm not even convinced that that those
 cases will be the most commonly used ones unless, of course, they're
 the only ones we offer, in which case everyone will go down that path
 by necessity.

 There are far too many variants and cases of our command to be able to
 extract their parameters in a flat way (a bunch of variables compared to
 a nested description ala json or xml), and I don't think such a flat
 representation is going to be much better than the parse tree.

I strongly disagree.  I think we'll find that with the right choice of
hook points, the number of variables that need to be exposed is quite
compact.  Indeed, I'd venture to say that needing to pass lots and
lots of information is evidence that you've made a poor choice of hook
point.

 Now, we will later be able to offer a normalized rewritten command
 string from the parse tree to the use, but I don't see us adding support
 for that from cascaded drops, one other reason why I like to expose them
 as sub commands.

 Again, I'm not understanding the distinction between toplevel events
 and sub-events.  I don't see any need for such a distinction.  I think
 there are just events, and some of them happen at command start/end
 and others happen somewhere in the middle.  As long as it's a safe and
 useful place to fire a trigger, who cares?

 I guess you're head is too heavily in the code side of things as opposed
 to the SQL user view point. Maybe my attempt to conciliate both views is
 not appropriate, but I really do think it is.

 Given the scope of this mini expression language, we can easily bypass
 calling the executor in v1 here, and reconsider later if we want to
 allow calling a UDF in the WHEN clause… I don't think it's an easy
 feature to add in, though.

 Or a necessary one.  AFAICS, the main benefit of WHEN clauses on

 Exactly.

 regular triggers is that you can prevent the AFTER trigger queue from
 

Re: [HACKERS] Command Triggers patch v18

2012-03-30 Thread Dimitri Fontaine
Robert Haas robertmh...@gmail.com writes:
 I'm thinking of things like extension whitelisting.  When some
 unprivileged user says CREATE EXTENSION harmless, and harmless is
 marked as superuser-only, we might like to have a hook that gets
 called *at permissions-checking time* and gets to say, oh, well, that
 extension is on the white-list, so we're going to allow it.  I think
 you can come up with similar cases for other commands, where in
 general the operation is restricted to superusers or database owners
 or table owners but in specific cases you want to allow others to do
 it.

I did that another way in previous incarnations of the patch, which was
to allow for INSTEAD OF event trigger backed by a SECURITY DEFINER
function. When the extension is whitelisted, prevent against recursion
then CREATE EXTENSION in the security definer function, then signal that
the execution should now be aborted.

That was too dangerous given the lack of policy about where exactly the
user code is fired, but I think we could now implement that for some of
the event timing specs we're listing. Only some of them, I guess only
those that are happening before we lock the objects.

I would then prefer using the INSTEAD OF words that are way more easy to
grasp than AT.

 CREATE EVENT TRIGGER name ON event_name (event_subtype_name [, ...])
 EXECUTE PROCEDURE function_name(args);

  create event trigger prohibit_some_ddl
     preceding timing spec
          when tag in ('CREATE TABLE', 'ALTER TABLE')
       execute procedure throw_an_error();

 I guess that would make sense if you think there would ever be more
 than one choice for trigger variable.  I'm not immediately seeing a
 use case for that, though - I was explicitly viewing the syntax foo

So, the variables in question are tag, objectid, objectname, schemaname
and from a very recent email context. On reflexion, I think the variable
here would only be either tag or context, and that's it.

 More generally, my thought on the structure of this is that you're
 going to have certain toplevel events, many of which will happen at
 only a single place in the code, like an object got dropped or a
 DDL command started or a DDL command ended.  So we give those
 names, like sql_drop, ddl_command_start, and ddl_command_end.  Inside

I really dislike mixing sql_drop and ddl_command_start as being the same
kind of objects here, even if I can bend my head in the right angle and
see that it's a fair view when looking at how it's implemented. I can't
see a way to explain that to users without having to explain them how
drop cascade is implemented.

So my proposal here is to “fake” a “proper“ subcommand thanks to the new
context variable. If you DROP TYPE foo CASCADE and that in turn drops a
function foo_in(), then an event trigger is fired with

   context = 'DROP TYPE'
   tag = 'DROP FUNCTION'

Same idea when you DROP TABLE … CASCADE and a SEQUENCE and a bunch of
index need to disappear too, you get an usual event trigger fired with
the context set to 'DROP TABLE' this time.

I don't think we need to arrange for explicitly publishing the context
specific information here. If we need to, we have to find the right
timing spec where we can guarantee still being in the top level command
and where we already have the details filled in, then users can attach a
trigger here and register the information for themselves.

 your trigger procedure, the set of magic variables that is available
 will depend on which toplevel event you set the trigger on, but
 hopefully all firings of that toplevel event can provide the same
 magic variables.  For example, at ddl_command_start time, you're just
 gonna get the command tag, but at ddl_command_end time you will get
 the command tag plus maybe some other stuff.

With my proposal above, you could get the same set of information when
being called as a toplevel event or a subevent (one where the context is
not null). That would mean adding object name and schema name lokkups in
the drop cascade code, though. We can also decide not to do that extra
lookup and just publish the object id which we certainly do have.

This way, the timing spec of a sub-event can still be of the same kind
as the top-level event ones, we still have before and after lock entry
points, same with lookup if we add that feature, etc.

 Now, we COULD stop there.  I mean, we could document that you can
 create a trigger on ddl_command_start and every DDL command will fire
 that trigger, and if the trigger doesn't care about some DDL
 operations, then it can look at the command tag and return without
 doing anything for the operations it doesn't care about.  The only
 real disadvantage of doing it that way is speed, and maybe a bit of
 code complexity within the trigger.  So my further thought was that

Within my “context proposal”, you also lose the ability to refer to sub
events as plain events with a context, which I find so much cleaner.

 we'd allow you to specify an optional filter 

Re: [HACKERS] Command Triggers patch v18

2012-03-30 Thread Robert Haas
On Fri, Mar 30, 2012 at 4:32 AM, Dimitri Fontaine
dimi...@2ndquadrant.fr wrote:
 I did that another way in previous incarnations of the patch, which was
 to allow for INSTEAD OF event trigger backed by a SECURITY DEFINER
 function. When the extension is whitelisted, prevent against recursion
 then CREATE EXTENSION in the security definer function, then signal that
 the execution should now be aborted.

 That was too dangerous given the lack of policy about where exactly the
 user code is fired, but I think we could now implement that for some of
 the event timing specs we're listing. Only some of them, I guess only
 those that are happening before we lock the objects.

Oh, right: I remember that now.  I still think it's a bad way to do
it, because the trigger potentially has a lot of work to do to
reconstruct a working command string, and it still ends up getting
executed by the wrong user.  For CREATE EXTENSION it's not that bad,
because the arguments to the command are so simple, but of course any
time we extend the CREATE EXTENSION syntax, the trigger needs to know
about it too whether it's security-relevant or not, and doing
something similar with, say, ALTER TABLE would be a ridiculously
complicated.  I think there is a use case for what you called an
INSTEAD OF trigger, but I don't believe in this one.  It seems to me
that there's a lot of power in being able to *just* intercept the
security decision and then let the rest of the command go about its
business.  Of course, you have to avoid getting security checks (like,
you must own the table in order to drop a column) with integrity
checks (like, you can't drop a column from pg_class) but I think
that's not very hard to get right.

 More generally, my thought on the structure of this is that you're
 going to have certain toplevel events, many of which will happen at
 only a single place in the code, like an object got dropped or a
 DDL command started or a DDL command ended.  So we give those
 names, like sql_drop, ddl_command_start, and ddl_command_end.  Inside

 I really dislike mixing sql_drop and ddl_command_start as being the same
 kind of objects here, even if I can bend my head in the right angle and
 see that it's a fair view when looking at how it's implemented. I can't
 see a way to explain that to users without having to explain them how
 drop cascade is implemented.

 So my proposal here is to “fake” a “proper“ subcommand thanks to the new
 context variable. If you DROP TYPE foo CASCADE and that in turn drops a
 function foo_in(), then an event trigger is fired with

   context = 'DROP TYPE'
   tag = 'DROP FUNCTION'

 Same idea when you DROP TABLE … CASCADE and a SEQUENCE and a bunch of
 index need to disappear too, you get an usual event trigger fired with
 the context set to 'DROP TABLE' this time.

 I don't think we need to arrange for explicitly publishing the context
 specific information here. If we need to, we have to find the right
 timing spec where we can guarantee still being in the top level command
 and where we already have the details filled in, then users can attach a
 trigger here and register the information for themselves.

I'm not sure I understand how you're using the words context and
tag.  I think for a drop trigger I would want the function to
receive this information: type of object dropped, OID of object
dropped, column number in the case of a column drop, flag indicating
whether it's a toplevel drop or a cascaded drop.  I wouldn't object to
also making the currently-in-context toplevel command tag available,
but I think most drop triggers wouldn't really care, so I wouldn't
personally spend much implementation effort on it if it turns out to
be hard.

But in general, I don't really know what a proper subcommand is or
why some subcommands should be more proper than others, or why we
should even be concerned about whether something is a subcommand at
all.  I think it's fine and useful to have triggers that fire in those
kinds of places, but I don't see why we should limit ourselves to
that.  For applications like replication, auditing, and enhanced
security, the parse tree and subcommand/non-subcommand status of a
particular operation are irrelevant.  What you need is an exact
description of the operation that got performed (e.g. the default on
table X column Y got dropped); you might be able to reverse-engineer
that from the parse tree, but it's much better to have the system pass
you the information you need more directly.  Certainly, there are
cases where you might want to have the parse tree, or even the raw
command text, available, but I'm not even convinced that that those
cases will be the most commonly used ones unless, of course, they're
the only ones we offer, in which case everyone will go down that path
by necessity.

 your trigger procedure, the set of magic variables that is available
 will depend on which toplevel event you set the trigger on, but
 hopefully all firings of that toplevel event can 

Re: [HACKERS] Command Triggers patch v18

2012-03-30 Thread Dimitri Fontaine
Robert Haas robertmh...@gmail.com writes:
 Oh, right: I remember that now.  I still think it's a bad way to do
 it, because the trigger potentially has a lot of work to do to
 reconstruct a working command string, and it still ends up getting
 executed by the wrong user.  For CREATE EXTENSION it's not that bad,

That's true, I'm only kind of saying that the INSTEAD OF keyword still
makes sense (instead of security_checks). I agree that the feature is
simpler to use the way to propose it.

   context = 'DROP TYPE'
   tag = 'DROP FUNCTION'

 I'm not sure I understand how you're using the words context and

Yeah context is not explicit, we could call that toplevel: the command
tag of the command that the user typed. When toplevel is null, the event
trigger is fired on a command the user sent, when it's not null, the
trigger is fired on some inner command operation.

 tag.  I think for a drop trigger I would want the function to
 receive this information: type of object dropped, OID of object
 dropped, column number in the case of a column drop, flag indicating
 whether it's a toplevel drop or a cascaded drop.  I wouldn't object to
 also making the currently-in-context toplevel command tag available,
 but I think most drop triggers wouldn't really care, so I wouldn't
 personally spend much implementation effort on it if it turns out to
 be hard.

I'm not sure it would be hard as I'm only seeing a single depth possible
here, so a single per-backend string static variable would do.

 But in general, I don't really know what a proper subcommand is or
 why some subcommands should be more proper than others, or why we
 should even be concerned about whether something is a subcommand at
 all.  I think it's fine and useful to have triggers that fire in those
 kinds of places, but I don't see why we should limit ourselves to
 that.  For applications like replication, auditing, and enhanced
 security, the parse tree and subcommand/non-subcommand status of a
 particular operation are irrelevant.  What you need is an exact

Not really. When replicating you could perfectly say that you only
replicate the toplevel DROP because the replica will also do the cascade
dance and you might have decided not to replicate all related objects on
the other side.

The information you need really want not to miss is when only the
cascaded object is part of the replication, not the main one. That was
not covered by my previous patch but now we have a way to cover it.

 description of the operation that got performed (e.g. the default on
 table X column Y got dropped); you might be able to reverse-engineer
 that from the parse tree, but it's much better to have the system pass
 you the information you need more directly.  Certainly, there are
 cases where you might want to have the parse tree, or even the raw
 command text, available, but I'm not even convinced that that those
 cases will be the most commonly used ones unless, of course, they're
 the only ones we offer, in which case everyone will go down that path
 by necessity.

There are far too many variants and cases of our command to be able to
extract their parameters in a flat way (a bunch of variables compared to
a nested description ala json or xml), and I don't think such a flat
representation is going to be much better than the parse tree.

Now, we will later be able to offer a normalized rewritten command
string from the parse tree to the use, but I don't see us adding support
for that from cascaded drops, one other reason why I like to expose them
as sub commands.

 Again, I'm not understanding the distinction between toplevel events
 and sub-events.  I don't see any need for such a distinction.  I think
 there are just events, and some of them happen at command start/end
 and others happen somewhere in the middle.  As long as it's a safe and
 useful place to fire a trigger, who cares?

I guess you're head is too heavily in the code side of things as opposed
to the SQL user view point. Maybe my attempt to conciliate both views is
not appropriate, but I really do think it is.

 Given the scope of this mini expression language, we can easily bypass
 calling the executor in v1 here, and reconsider later if we want to
 allow calling a UDF in the WHEN clause… I don't think it's an easy
 feature to add in, though.

 Or a necessary one.  AFAICS, the main benefit of WHEN clauses on

Exactly.

 regular triggers is that you can prevent the AFTER trigger queue from
 getting huge, and maybe a save a little on the cost of invoking a
 trigger function just to exit again.  But neither of those should be
 relevant here - nobody does that much DDL, and anybody writing command
 triggers should understand that this is advanced magic not intended
 for beginners.  Wizards below level 10 need not apply.

Here it's only a facility to manage your event trigger code
organization, and I insisted in having it in the syntax because in the
event trigger grammar I don't see another place where to stuff 

Re: [HACKERS] Command Triggers patch v18

2012-03-29 Thread Robert Haas
On Wed, Mar 28, 2012 at 3:41 PM, Dimitri Fontaine
dimi...@2ndquadrant.fr wrote:
 What about :

  create command trigger foo before prepare alter table …
  create command trigger foo before start of alter table …
  create command trigger foo before execute alter table …
  create command trigger foo before end of alter table …

  create command trigger foo when prepare alter table …
  create command trigger foo when start alter table …
  create command trigger foo when execute of alter table …
  create command trigger foo when end of alter table …

  create command trigger foo preceding alter table …
  create command trigger foo preceding alter table … deferred
  create command trigger foo preceding alter table … immediate

  create command trigger foo following parser of alter table …
  create command trigger foo preceding execute of alter table …

  create command trigger foo following alter table …

  create command trigger foo before alter table nowait …

 I'm not sure how many hooks we can really export here, but I guess we
 have enough existing keywords to describe when they get to run (parser,
 mapping, lock, check, begin, analyze, access, disable, not exists, do,
 end, escape, extract, fetch, following, search…)

I am sure that we could find a way to beat this with a stick until it
behaves, but I don't really like that idea.  It seems to me to be a
case of adding useless parser bloat.  Prior to 9.0, when we introduced
the new EXPLAIN syntax, new EXPLAIN options were repeatedly proposed
and partly rejected on the grounds that they weren't important enough
to justify adding new keywords.  We've now got EXPLAIN (COSTS,
BUFFERS, TIMING, FORMAT) and there will probably be more in the
future, and the parser overhead of adding a new one is zero.  I think
we should learn from that lesson: when you may want to have a bunch of
different options and it's not too clear what they're all going to be
named, designing things in a way that avoids a dependency on the main
parser having the right keywords leads to less patch rejection and
more getting stuff done.

 I've also had another general thought about safety: we need to make
 sure that we're only firing triggers from places where it's safe for
 user code to perform arbitrary actions.  In particular, we have to
 assume that any triggers we invoke will AcceptInvalidationMessages()
 and CommandCounterIncrement(); and we probably can't do it at all (or
 at least not without some additional safeguard) in places where the
 code does CheckTableNotInUse() up through the point where it's once
 again safe to access the table, since the trigger might try.  We also

 I've been trying to do that already.

 need to think about re-entrancy: that is, in theory, the trigger might
 execute any other DDL command - e.g. it might drop the table that
 we've been asked to rename.  I suspect that some of the current BEFORE

 That's why I implemented ALTER COMMAND TRIGGER ... SET DISABLE in the
 first place, so that you could run the same command from the trigger
 itself without infinite recursion.

 trigger stuff might fall down on that last one, since the existing
 code not-unreasonably expects that once it's locked the table, the
 catalog entries can't go away.  Maybe it all happens to work out OK,
 but I don't think we can count on that.

 I didn't think of DROP TABLE in a command table running BEFORE ALTER
 TABLE, say. That looks evil. It could be documented as yet another way
 to shoot yourself in the foot though?

Well, it depends somewhat on how it fails.  If it fails by crashing
the server, for example, I don't think that's going to fly.  My
suspicion is that it won't do that, but what it might do is fail in
some pretty odd and unpredictable ways, possibly leading to catalog
corruption, which I don't feel too good about either.  Think about not
just ALTER vs. DROP but also ALTER vs. ALTER.  It's probably easier to
add guards against this kind of thing than it is to prove that it's
not going to do anything too wacky; the obvious idea that comes to
mind is to bump the command counter after returning from the last
trigger (if that doesn't happen already) and then verify that the
tuple is still present and has whatever other properties we've already
checked and are counting on, and if not chuck an error.

I think that for a first version of this feature it probably would be
smart to trim this back to something that doesn't involve surgery on
the guts of every command; then, we can extend incrementally.  Nothing
you've proposed seems impossible to me, but most of the really
interesting things are hard, and it would be much easier to handle
patches intended to cater to more complex use cases if the basic
infrastructure were already committed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:

Re: [HACKERS] Command Triggers patch v18

2012-03-29 Thread Dimitri Fontaine
Robert Haas robertmh...@gmail.com writes:
 I am sure that we could find a way to beat this with a stick until it
 behaves, but I don't really like that idea.  It seems to me to be a
[...]
 we should learn from that lesson: when you may want to have a bunch of

I first wanted to ensure that reusing existing parser keyword wouldn't
be well enough for our case as I find that solution more elegant. If we
can't think of future places where we would want to add support for
calling command triggers, then yes, you're right, something more
flexible is needed.

I'll go make that happen, and still need input here. We first want to
have command triggers on specific commands or ANY command, and we want
to implement 3 places from where to fire them.

Here's a new syntax proposal to cope with that:

 create command trigger before COMMAND_STEP of alter table
  execute procedure snitch();

 - before the process utility switch, with only command tag and parse
   tree

 create command trigger foo before start of alter table
  execute procedure snitch();

 - before running the command, after having done basic error checks,
   security checks, name lookups and locking, with all information

 create command trigger before execute of alter table
  execute procedure snitch();

 - after having run the command

 create command trigger foo before end of alter table
  execute procedure snitch();

Some other places make sense to add support for, but working within the
delays we have and with the current patch, those 3 points are what's
easy enough to implement now.

 I didn't think of DROP TABLE in a command table running BEFORE ALTER
 TABLE, say. That looks evil. It could be documented as yet another way
 to shoot yourself in the foot though?

 I think that for a first version of this feature it probably would be
 smart to trim this back to something that doesn't involve surgery on
 the guts of every command; then, we can extend incrementally.

A simple enough solution here would be to register a list disabled
commands per objectid before running the user defined function, and
check against that list before allowing it to run.

As we have command trigger support for all the commands we want to be
able to block, the lookup and ereport() call can be implemented in
InitCommandContext() where we already apply some limitations (like no
command triggers on command trigger commands).

Once that is in place it's possible to refine the pre-condition checks
on a per command basis and stop adding the command from the black list.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers patch v18

2012-03-29 Thread Thom Brown
On 29 March 2012 13:30, Dimitri Fontaine dimi...@2ndquadrant.fr wrote:
 I'll go make that happen, and still need input here. We first want to
 have command triggers on specific commands or ANY command, and we want
 to implement 3 places from where to fire them.

 Here's a new syntax proposal to cope with that:

     create command trigger before COMMAND_STEP of alter table
          execute procedure snitch();

  - before the process utility switch, with only command tag and parse
   tree

     create command trigger foo before start of alter table
          execute procedure snitch();

  - before running the command, after having done basic error checks,
   security checks, name lookups and locking, with all information

     create command trigger before execute of alter table
          execute procedure snitch();

  - after having run the command

     create command trigger foo before end of alter table
          execute procedure snitch();

Is it necessary to add this complexity in this version?  Can't we keep
it simple but in a way that allows the addition of this later?  The
testing of all these new combinations sounds like a lot of work.

-- 
Thom

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers patch v18

2012-03-29 Thread Robert Haas
On Thu, Mar 29, 2012 at 9:01 AM, Thom Brown t...@linux.com wrote:
 On 29 March 2012 13:30, Dimitri Fontaine dimi...@2ndquadrant.fr wrote:
 I'll go make that happen, and still need input here. We first want to
 have command triggers on specific commands or ANY command, and we want
 to implement 3 places from where to fire them.

 Here's a new syntax proposal to cope with that:

     create command trigger before COMMAND_STEP of alter table
          execute procedure snitch();

  - before the process utility switch, with only command tag and parse
   tree

     create command trigger foo before start of alter table
          execute procedure snitch();

  - before running the command, after having done basic error checks,
   security checks, name lookups and locking, with all information

     create command trigger before execute of alter table
          execute procedure snitch();

  - after having run the command

     create command trigger foo before end of alter table
          execute procedure snitch();

 Is it necessary to add this complexity in this version?  Can't we keep
 it simple but in a way that allows the addition of this later?  The
 testing of all these new combinations sounds like a lot of work.

I concur.  This is way more complicated than we should be trying to do
in version 1.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers patch v18

2012-03-29 Thread Robert Haas
On Thu, Mar 29, 2012 at 8:30 AM, Dimitri Fontaine
dimi...@2ndquadrant.fr wrote:
 I'll go make that happen, and still need input here. We first want to
 have command triggers on specific commands or ANY command, and we want
 to implement 3 places from where to fire them.

 Here's a new syntax proposal to cope with that:

     create command trigger before COMMAND_STEP of alter table
          execute procedure snitch();

  - before the process utility switch, with only command tag and parse
   tree

     create command trigger foo before start of alter table
          execute procedure snitch();

  - before running the command, after having done basic error checks,
   security checks, name lookups and locking, with all information

     create command trigger before execute of alter table
          execute procedure snitch();

  - after having run the command

     create command trigger foo before end of alter table
          execute procedure snitch();

One thought is that it might be better to say AT or ON or WHEN rather
than BEFORE, since BEFORE END is just a little strange; and also
because a future hook point might be something like
permissions-checking, and we really want it to be AT
permissions-checking time, not BEFORE permissions-checking time.

If I got to pick, I'd pick this syntax:

CREATE EVENT TRIGGER name ON event_name (event_subtype_name [, ...])
EXECUTE PROCEDURE function_name(args);

Then you could eventually imagine:

CREATE EVENT TRIGGER prohibit_all_ddl ON ddl_command_start EXECUTE
PROCEDURE throw_an_error();
CREATE EVENT TRIGGER prohibit_some_ddl ON ddl_command_start
(alter_table) EXECUTE PROCEDURE throw_an_error();
CREATE EVENT TRIGGER replicate_table_drops ON sql_drop (table) EXECUTE
PROCEDURE record_table_drop();
CREATE EVENT TRIGGER allow_whitelisted_extensions ON permissions_check
(create_extension) EXECUTE PROCEDURE make_heroku_happy();
CREATE EVENT TRIGGER replicate_at_subcommands ON subcommand_start
(alter_table) EXECUTE PROCEDURE record_alter_table_subcommand();

Piece by piece:

- I prefer EVENT to COMMAND, because the start or end or any other
point during the execution of a command can be viewed as an event;
however, it is pretty clear that not everything that can be viewed as
an event upon which we might like triggers to fire can be viewed as a
command, even if we have a somewhat well-defined notion of
subcommands.  I continue to think that there is no sense in which
creating a sequence to back a serial column or dropping an object due
to DROP .. CASCADE is a subcommand; it is, however, pretty clearly an
event.  Anything we want can be an event.  I think if we don't get
this right in the first version we're going to be stuck with a
misnomer forever.  :-(

- It is pretty clear that for triggers that are supposed to fire at
the start or end of a command, it's useful to be able to specify which
commands it fires for.  However, there must be other types of events
(as in my postulated sql_drop case above) where the name of the
command is not relevant - people aren't going to find all the objects
that got dropped as a result of a toplevel drop table command; they're
going to want to find all the tables that got dropped regardless of
which incantation did it.  Also keep in mind that you can do things
like use ALTER TABLE to rename a view (and there are other cases of
that sort of confusion with domains vs. types, aggregates vs.
functions, etc), so being able to filter based on object type seems
clearly better in a bunch of real-world cases.  And, even if you don't
believe that specific example, a general syntax leaves us with freedom
to maneuver if we discover other needs in the future.  Writing
alter_table etc. as tokens rather than as ALTER TABLE also has the
further advantages of (1) greatly decreasing the parser footprint of
this feature and (2) relieving everyone who adds commands in the
future of the need to also change the command-trigger grammar.

- I don't think that triggers at what Dimitri is calling before
execute are a good idea at all for the first version of this feature,
because there is a lot of stuff that might break and examining that
should really be a separate project from getting the basic
infrastructure in place.  However, I also don't like the name.  Unlike
queries, DDL commands don't have clearly separate planning and
execution phases, so saying that something happens before execution
isn't really very clear.  As previously discussed, the hook points in
the latest patch are not all entirely consistent between one command
and the next, not only in the larger ways I've already pointed out but
also in some smaller ways.  Different commands do different kinds of
integrity checks and the firing points are not always in exactly the
same place in that sequence.  Of course, not all the commands do all
the checks in the same order, so some possibly-not-trivial refactoring
is probably required to get that consistent.  Moreover, it doesn't
seem out of reach to 

Re: [HACKERS] Command Triggers patch v18

2012-03-29 Thread Dimitri Fontaine
Robert Haas robertmh...@gmail.com writes:
 On Thu, Mar 29, 2012 at 9:01 AM, Thom Brown t...@linux.com wrote:
 On 29 March 2012 13:30, Dimitri Fontaine dimi...@2ndquadrant.fr wrote:
 I'll go make that happen, and still need input here. We first want to
 have command triggers on specific commands or ANY command, and we want
 to implement 3 places from where to fire them.

 Here's a new syntax proposal to cope with that:

 Is it necessary to add this complexity in this version?  Can't we keep
 it simple but in a way that allows the addition of this later?  The
 testing of all these new combinations sounds like a lot of work.

 I concur.  This is way more complicated than we should be trying to do
 in version 1.

I'm at a loss here. This proposal was so that we can reach a commonly
agreed minimal solution and design in first version. There's no new
piece of infrastructure to add, the syntax is changed only to open the
road for later, I'm not changing where the current command triggers are
to be called (except for those which are misplaced).

So, please help me here: what do we want to have in 9.3?

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers patch v18

2012-03-29 Thread Thom Brown
On 29 March 2012 16:30, Dimitri Fontaine dimi...@2ndquadrant.fr wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Thu, Mar 29, 2012 at 9:01 AM, Thom Brown t...@linux.com wrote:
 On 29 March 2012 13:30, Dimitri Fontaine dimi...@2ndquadrant.fr wrote:
 I'll go make that happen, and still need input here. We first want to
 have command triggers on specific commands or ANY command, and we want
 to implement 3 places from where to fire them.

 Here's a new syntax proposal to cope with that:

 Is it necessary to add this complexity in this version?  Can't we keep
 it simple but in a way that allows the addition of this later?  The
 testing of all these new combinations sounds like a lot of work.

 I concur.  This is way more complicated than we should be trying to do
 in version 1.

 I'm at a loss here. This proposal was so that we can reach a commonly
 agreed minimal solution and design in first version. There's no new
 piece of infrastructure to add, the syntax is changed only to open the
 road for later, I'm not changing where the current command triggers are
 to be called (except for those which are misplaced).

 So, please help me here: what do we want to have in 9.3?

Perhaps I misunderstood.  It was the addition of the fine-grained even
options (parse, execute etc) I saw as new.  If you're saying those are
just possible options for later that won't be in this version, I'm
fine with that.  If those are to make it for 9.2, then creating the
necessary test cases and possible fixes sounds infeasible in such a
short space of time.  Please disregard if this is not the case.

-- 
Thom

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers patch v18

2012-03-29 Thread Dimitri Fontaine
Robert Haas robertmh...@gmail.com writes:
     create command trigger before COMMAND_STEP of alter table
          execute procedure snitch();

 One thought is that it might be better to say AT or ON or WHEN rather
 than BEFORE, since BEFORE END is just a little strange; and also
 because a future hook point might be something like
 permissions-checking, and we really want it to be AT
 permissions-checking time, not BEFORE permissions-checking time.

Yeah I tried different spellings and almost sent a version using AT or
WHEN, but it appeared to me not to be specific enough: AT permission
checking time does not exist, it either happens before or after
permission checking. I played with using “preceding” rather than before,
too, maybe you would like that better.

So I would document the different steps and their ordering, then use
only BEFORE as a qualifier, and add a pseudo step which is the end of
the command.

 If I got to pick, I'd pick this syntax:

 CREATE EVENT TRIGGER name ON event_name (event_subtype_name [, ...])
 EXECUTE PROCEDURE function_name(args);

 Then you could eventually imagine:

 CREATE EVENT TRIGGER prohibit_all_ddl ON ddl_command_start EXECUTE
 PROCEDURE throw_an_error();
 CREATE EVENT TRIGGER prohibit_some_ddl ON ddl_command_start
 (alter_table) EXECUTE PROCEDURE throw_an_error();
 CREATE EVENT TRIGGER replicate_table_drops ON sql_drop (table) EXECUTE
 PROCEDURE record_table_drop();
 CREATE EVENT TRIGGER allow_whitelisted_extensions ON permissions_check
 (create_extension) EXECUTE PROCEDURE make_heroku_happy();
 CREATE EVENT TRIGGER replicate_at_subcommands ON subcommand_start
 (alter_table) EXECUTE PROCEDURE record_alter_table_subcommand();

I would amend that syntax to allow for a WHEN expr much like in the
DML trigger case, where the expression can play with the variables
exposed at the time of calling the trigger.

  create event trigger prohibit_some_ddl
 preceding timing spec
  when tag in ('CREATE TABLE', 'ALTER TABLE')
   execute procedure throw_an_error();

We must also take some time to describe the timing specs carefully,
their order of operation and which information are available for
triggers in each of them. See below.

I also don't like the way you squeeze in the drop cascade support here
by re qualifying it as a timing spec, which it's clearly not.

On subcommand_start, what is the tag set to? the main command or the
subcommand? if the former, where do you find the current subcommand tag?
I don't think subcommand_start should be a timing spec either.

 Piece by piece:

 - I prefer EVENT to COMMAND, because the start or end or any other
 point during the execution of a command can be viewed as an event;
 however, it is pretty clear that not everything that can be viewed as
 an event upon which we might like triggers to fire can be viewed as a
 command, even if we have a somewhat well-defined notion of
 subcommands.  I continue to think that there is no sense in which
 creating a sequence to back a serial column or dropping an object due
 to DROP .. CASCADE is a subcommand; it is, however, pretty clearly an
 event.  Anything we want can be an event.  I think if we don't get
 this right in the first version we're going to be stuck with a
 misnomer forever.  :-(

I can buy naming what we're building here EVENT TRIGGERs, albeit with
some caveats: what I implemented here only caters for commands for which
we have a distinct parse tree. That's what sub commands are in effect
and why it supports create schema sub commands but not all alter table
declination, and not cascaded drops.

Also, implementing cascade operations as calling process utility with a
complete parse tree is not going to be a little project on its own. We
can also punt that and document that triggers fired on cascaded drops
are not receiving a parse tree, only the command tag and object id and
object name and schema name, or even just the command tag and object id
in order not to incur too many additional name lookups.

Lastly, EVENT opens the road to thinking about transaction event
triggers, on begin|commit|rollback etc. Do we want to take that road?

 - It is pretty clear that for triggers that are supposed to fire at
 the start or end of a command, it's useful to be able to specify which
 commands it fires for.  However, there must be other types of events
 (as in my postulated sql_drop case above) where the name of the
 command is not relevant - people aren't going to find all the objects
 that got dropped as a result of a toplevel drop table command; they're
 going to want to find all the tables that got dropped regardless of
 which incantation did it.  Also keep in mind that you can do things
 like use ALTER TABLE to rename a view (and there are other cases of
 that sort of confusion with domains vs. types, aggregates vs.
 functions, etc), so being able to filter based on object type seems
 clearly better in a bunch of real-world cases.  And, even if you don't
 believe that specific 

Re: [HACKERS] Command Triggers patch v18

2012-03-29 Thread Robert Haas
On Thu, Mar 29, 2012 at 11:49 AM, Thom Brown t...@linux.com wrote:
 On 29 March 2012 16:30, Dimitri Fontaine dimi...@2ndquadrant.fr wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Thu, Mar 29, 2012 at 9:01 AM, Thom Brown t...@linux.com wrote:
 On 29 March 2012 13:30, Dimitri Fontaine dimi...@2ndquadrant.fr wrote:
 I'll go make that happen, and still need input here. We first want to
 have command triggers on specific commands or ANY command, and we want
 to implement 3 places from where to fire them.

 Here's a new syntax proposal to cope with that:

 Is it necessary to add this complexity in this version?  Can't we keep
 it simple but in a way that allows the addition of this later?  The
 testing of all these new combinations sounds like a lot of work.

 I concur.  This is way more complicated than we should be trying to do
 in version 1.

 I'm at a loss here. This proposal was so that we can reach a commonly
 agreed minimal solution and design in first version. There's no new
 piece of infrastructure to add, the syntax is changed only to open the
 road for later, I'm not changing where the current command triggers are
 to be called (except for those which are misplaced).

 So, please help me here: what do we want to have in 9.3?

 Perhaps I misunderstood.  It was the addition of the fine-grained even
 options (parse, execute etc) I saw as new.  If you're saying those are
 just possible options for later that won't be in this version, I'm
 fine with that.  If those are to make it for 9.2, then creating the
 necessary test cases and possible fixes sounds infeasible in such a
 short space of time.  Please disregard if this is not the case.

So...

I've said repeatedly and over a long period of time that development
of this feature wasn't started early enough in the cycle to get it
finished in time for 9.2.  I think that I've identified some pretty
serious issues in the discussion we've had so far, especially (1) the
points at which figures are fired aren't consistent between commands,
(2) not much thought has been given to what happens if, say, a DDL
trigger performs a DDL operation on the table the outer DDL command is
due to modify, and (3) we are eventually going to want to trap a much
richer set of events than can be captured by the words before and
after.  Now, you could view this as me throwing up roadblocks to
validate my previously-expressed opinion that this wasn't going to get
done, but I really, honestly believe that these are important issues
and that getting them right is more important than getting something
done now.

If we still want to try to squeeze something into 9.2, I recommend
stripping out everything except for what Dimitri called the
before-any-command firing point.  In other words, add a way to run a
procedure after parsing of a command but before any name lookups have
been done, any permissions checks have been done, or any locks have
been taken.  The usefulness of such a hook is of course limited but it
is also a lot less invasive than the patch I recently reviewed and
probably a lot safer.  I actually think it's wise to do that as a
first step even if it doesn't make 9.2, because it is much easier to
build features like this incrementally and even a patch that does that
will be reasonably complicated and difficult to review.
Parenthetically, what Dimitri previously called the after-any-command
firing point, all the way at the end of the statement but without any
specific details about the object the statement operated on, seems
just as good for a first step, maybe better, so that would be a fine
foundation from my point of view as well.  The stuff that happens
somewhere in the middle, even just after locking and permissions
checking, is more complex and I think that should be phase 2
regardless of which phase ends up in which release cycle.

I'm not sure whether Dimitri's question about 9.3 was a typo for 9.2
or whether that's what he was actually asking about.  But to answer
that question for 9.3, I think we have time to build an extremely
extensive infrastructure that covers a huge variety of use cases, and
I think there is hardly any reasonable proposal that is off the table
as far as that goes.  We have a year to get that work done, and a year
is a long time, and with incremental progress at each CommitFest we
can do a huge amount.  I can also say that I would be willing to put
in some serious work during the 9.3 cycle to accelerate that progress,
too, especally if (ahem) some other people can return the favor by
reviewing my patches.  :-)  I could see us adding the functionality
described above in one CommitFest and then spending the next three
adding more whiz-bango frammishes and ending up with something really
nice.  Right now, though, we are very crunched for time, and probably
shouldn't be entertaining anything that requires a tenth the code
churn that this patch probably does; if we are going to do anything at
all, it had better be as simple and uninvasive as we 

Re: [HACKERS] Command Triggers patch v18

2012-03-29 Thread Dimitri Fontaine
Robert Haas robertmh...@gmail.com writes:
 I've said repeatedly and over a long period of time that development
 of this feature wasn't started early enough in the cycle to get it
 finished in time for 9.2.  I think that I've identified some pretty

That could well be, yeah.

 serious issues in the discussion we've had so far, especially (1) the
 points at which figures are fired aren't consistent between commands,
 (2) not much thought has been given to what happens if, say, a DDL
 trigger performs a DDL operation on the table the outer DDL command is
 due to modify, and (3) we are eventually going to want to trap a much
 richer set of events than can be captured by the words before and
 after.  Now, you could view this as me throwing up roadblocks to
 validate my previously-expressed opinion that this wasn't going to get
 done, but I really, honestly believe that these are important issues
 and that getting them right is more important than getting something
 done now.

(1) is not hard to fix as soon as we set a policy, which the most
pressing thing we need to do in my mind, whatever we manage to
commit in 9.2.

(2) can be addressed with a simple blacklist that would be set just
before calling user defined code, and cleaned when done running it.
When in place the blacklist lookup is easy to implement in a central
place (utility.c or cmdtrigger.c) and ereport() when the current
command is in the blacklist.

e.g. alter table would blacklist alter table and drop table commands
on the current object id

(3) we need to continue designing that, yes. I think we can have a first
set of events defined now, even if we don't implement support for
all of them readily.

 If we still want to try to squeeze something into 9.2, I recommend
 stripping out everything except for what Dimitri called the
 before-any-command firing point.  In other words, add a way to run a

I would like that we can make that consistent rather than throw it, or
maybe salvage a part of the command we support here. It's easy enough if
boresome to document which commands are supported in which event timing.

 procedure after parsing of a command but before any name lookups have
 been done, any permissions checks have been done, or any locks have
 been taken.  The usefulness of such a hook is of course limited but it
 is also a lot less invasive than the patch I recently reviewed and
 probably a lot safer.  I actually think it's wise to do that as a
 first step even if it doesn't make 9.2, because it is much easier to
 build features like this incrementally and even a patch that does that
 will be reasonably complicated and difficult to review.

Yes.

 Parenthetically, what Dimitri previously called the after-any-command
 firing point, all the way at the end of the statement but without any
 specific details about the object the statement operated on, seems
 just as good for a first step, maybe better, so that would be a fine
 foundation from my point of view as well.  The stuff that happens

Now that fetching the information is implemented, I guess that we could
still provide for it when firing event trigger at that timing spec. Of
course that means a bigger patch to review when compared to not having
the feature, but Thom did spend loads of time to test this part of the
implementation.

 somewhere in the middle, even just after locking and permissions
 checking, is more complex and I think that should be phase 2
 regardless of which phase ends up in which release cycle.

Mmmm, ok.

 I'm not sure whether Dimitri's question about 9.3 was a typo for 9.2

Typo :)  I appreciate your offer, though :)

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers patch v18

2012-03-29 Thread Robert Haas
On Thu, Mar 29, 2012 at 12:17 PM, Dimitri Fontaine
dimi...@2ndquadrant.fr wrote:
 Robert Haas robertmh...@gmail.com writes:
     create command trigger before COMMAND_STEP of alter table
          execute procedure snitch();

 One thought is that it might be better to say AT or ON or WHEN rather
 than BEFORE, since BEFORE END is just a little strange; and also
 because a future hook point might be something like
 permissions-checking, and we really want it to be AT
 permissions-checking time, not BEFORE permissions-checking time.

 Yeah I tried different spellings and almost sent a version using AT or
 WHEN, but it appeared to me not to be specific enough: AT permission
 checking time does not exist, it either happens before or after
 permission checking. I played with using “preceding” rather than before,
 too, maybe you would like that better.

Well, preceding and before are synonyms, so I don't see any advantage
in that change.  But I really did mean AT permissions_checking time,
not before or after it.  That is, we'd have a hook where instead of
doing something like this:

aclresult = pg_class_aclcheck(reloid, GetUserId(), ACL_SELECT);

...we'd call a superuser-supplied trigger function and let it make the
call.  This requires a clear separation of permissions and integrity
checking that we are currently lacking, and probably quite a bit of
other engineering too, but I think we should assume that we're going
to do it at some point because there is it seems pretty clear that
there is a huge amount of pent-up demand for being able to do things
like this.

 If I got to pick, I'd pick this syntax:

 CREATE EVENT TRIGGER name ON event_name (event_subtype_name [, ...])
 EXECUTE PROCEDURE function_name(args);

 Then you could eventually imagine:

 CREATE EVENT TRIGGER prohibit_all_ddl ON ddl_command_start EXECUTE
 PROCEDURE throw_an_error();
 CREATE EVENT TRIGGER prohibit_some_ddl ON ddl_command_start
 (alter_table) EXECUTE PROCEDURE throw_an_error();
 CREATE EVENT TRIGGER replicate_table_drops ON sql_drop (table) EXECUTE
 PROCEDURE record_table_drop();
 CREATE EVENT TRIGGER allow_whitelisted_extensions ON permissions_check
 (create_extension) EXECUTE PROCEDURE make_heroku_happy();
 CREATE EVENT TRIGGER replicate_at_subcommands ON subcommand_start
 (alter_table) EXECUTE PROCEDURE record_alter_table_subcommand();

 I would amend that syntax to allow for a WHEN expr much like in the
 DML trigger case, where the expression can play with the variables
 exposed at the time of calling the trigger.

  create event trigger prohibit_some_ddl
     preceding timing spec
          when tag in ('CREATE TABLE', 'ALTER TABLE')
       execute procedure throw_an_error();

We could do it that way, but the tag in ('CREATE TABLE', 'ALTER
TABLE') syntax will be tricky to evaluate.  I'd really prefer NOT to
need to start up and shut down the executor to figure out whether the
trigger has to fire.  If we are going to do it that way then we need
to carefully define what variables are available and what values they
get.  I think that most people's event-filtering needs will be simple
enough to be served by a more declarative syntax.

 We must also take some time to describe the timing specs carefully,
 their order of operation and which information are available for
 triggers in each of them. See below.

 I also don't like the way you squeeze in the drop cascade support here
 by re qualifying it as a timing spec, which it's clearly not.

 On subcommand_start, what is the tag set to? the main command or the
 subcommand? if the former, where do you find the current subcommand tag?
 I don't think subcommand_start should be a timing spec either.

That's up for grabs, but I was thinking the subcommand.  For example, consider:

alter table t alter column a add column b int, disable trigger all;

I would imagine this having a firing sequence something like this:

ddl_command_start:alter_table
ddl_command_permissions_check:alter_table
ddl_command_name_lookup:alter_table
alter_table_subcommand_prep:add_column
alter_table_subcommand_prep:disable_trigger_all
alter_table_subcommand_start:add_column
sql_create:column
alter_table_subcommand_end:add_column
alter_table_subcommand_start:disable_trigger_all
alter_table_subcommand_end:disable_trigger_all
ddl_command_end:alter_table

Lots of room for argument there, of course.

 Piece by piece:

 - I prefer EVENT to COMMAND, because the start or end or any other
 point during the execution of a command can be viewed as an event;
 however, it is pretty clear that not everything that can be viewed as
 an event upon which we might like triggers to fire can be viewed as a
 command, even if we have a somewhat well-defined notion of
 subcommands.  I continue to think that there is no sense in which
 creating a sequence to back a serial column or dropping an object due
 to DROP .. CASCADE is a subcommand; it is, however, pretty clearly an
 event.  Anything we want can be an event.  I think if we don't get
 

Re: [HACKERS] Command Triggers patch v18

2012-03-29 Thread Robert Haas
Apropos of nothing and since I haven't found a particularly good time
to say this in amidst all the technical discussion, I appreciate very
much all the work you've been putting into this.

On Thu, Mar 29, 2012 at 12:42 PM, Dimitri Fontaine
dimi...@2ndquadrant.fr wrote:
 serious issues in the discussion we've had so far, especially (1) the
 points at which figures are fired aren't consistent between commands,
 (2) not much thought has been given to what happens if, say, a DDL
 trigger performs a DDL operation on the table the outer DDL command is
 due to modify, and (3) we are eventually going to want to trap a much
 richer set of events than can be captured by the words before and
 after.  Now, you could view this as me throwing up roadblocks to
 validate my previously-expressed opinion that this wasn't going to get
 done, but I really, honestly believe that these are important issues
 and that getting them right is more important than getting something
 done now.

 (1) is not hard to fix as soon as we set a policy, which the most
    pressing thing we need to do in my mind, whatever we manage to
    commit in 9.2.

I agree that setting a policy is enormously important, not so sure
about how easy it is to fix, but we shall see.

My primary and overriding goal here is to make sure that we don't make
any design decisions, in the syntax or otherwise, that foreclose the
ability to add policies that we've not yet dreamed of to future
versions of the code.  It seems vanishingly unlikely that the first
commit will cover all the use cases that people care about, and only
slightly more likely that we'll even be able to list them all at that
point.  Perhaps I'm wrong and we already know what they are, but I'd
like to bet against any of us - and certainly me - being that smart.

In terms of policies, there are two that seems to me to be very clear:
at the very beginning of ProcessUtility, and at the very end of it,
*excluding* recursive calls to ProcessUtility that are intended to
handle subcommands.  I think we could call the first start or
command_start or ddl_command_start or post-parse or pre-locking or
something along those lines, and the second end or command_end or
ddl_command_end.  I think it might be worth including ddl in there
because the scope of the current patch really is just DDL (as opposed
to say DCL) and having a separate set of firing points for DCL does
not seem dumb.  If we ever try to do anything with transaction control
as you suggested upthread that's likely to also require special
handling.  Calling it, specifically, ddl_command_start makes the scope
very clear.

 (2) can be addressed with a simple blacklist that would be set just
    before calling user defined code, and cleaned when done running it.
    When in place the blacklist lookup is easy to implement in a central
    place (utility.c or cmdtrigger.c) and ereport() when the current
    command is in the blacklist.

    e.g. alter table would blacklist alter table and drop table commands
    on the current object id

Here we're talking about a firing point that we might call
ddl_post_name_lookup or something along those lines.  I would prefer
to handle this by making the following rules - here's I'm assuming
that we're talking about the case where the object in question is a
relation:

1. The trigger fires immediately after RangeVarGetRelidExtended and
before any other checks are performed, and especially before any
CheckTableNotInUse(). This last is important, because what matters is
whether the table's not in use AFTER all possible user-supplied code
is executed.  If the trigger opens or closes cursors, for example,
what matters is whether there are any cursors left open *after* the
triggers complete, not whether there were any open on entering the
trigger.  The same is true of any other integrity check: if there's a
chance that the trigger could change something that affects whether
the integrity constraint gets fired, then we'd better be darn sure to
fire the trigger before we check the integrity constraint.

2. If we fire any triggers at this point, we CommandCounterIncrement()
and then re-do RangeVarGetRelidExtended() with the same arguments we
passed before.  If it doesn't return the same OID we got the first
time, we abort with some kind of serialization error.  We need to be a
little careful about the phrasing of the error message, because it's
possible for this to change during command execution even without
command triggers if somebody creates a table earlier in the search
path with the same unqualified name that was passed to the command,
but I think it's OK for the existence of a command-trigger to cause a
spurious abort in that very narrow case, as long as the error message
includes some appropriate weasel language.  Alternatively, we could
try to avoid that corner case by rechecking only that a tuple with the
right OID is still present and still has the correct relkind, but that
seems like it might be a little less 

Re: [HACKERS] Command Triggers patch v18

2012-03-29 Thread Dimitri Fontaine
Robert Haas robertmh...@gmail.com writes:
 Well, preceding and before are synonyms, so I don't see any advantage
 in that change.  But I really did mean AT permissions_checking time,
 not before or after it.  That is, we'd have a hook where instead of
 doing something like this:

 aclresult = pg_class_aclcheck(reloid, GetUserId(), ACL_SELECT);

 ...we'd call a superuser-supplied trigger function and let it make the

Wow. I don't think I like that at all. It's indeed powerful, but how do
you go explaining and fixing unanticipated behavior with such things in
production? It looks too much like an invitation to break a very careful
design where each facility has to rove itself to get in.

So yes, that's not at all what I was envisioning, I still think that
most event specs for event triggers are going to have to happen in
between our different internal implementation of given facility.

Maybe we should even use BEFORE in cases I had in mind and AT for cases
like the one you're picturing?

 CREATE EVENT TRIGGER name ON event_name (event_subtype_name [, ...])
 EXECUTE PROCEDURE function_name(args);

 I would amend that syntax to allow for a WHEN expr much like in the
 DML trigger case, where the expression can play with the variables
 exposed at the time of calling the trigger.

  create event trigger prohibit_some_ddl
     preceding timing spec
          when tag in ('CREATE TABLE', 'ALTER TABLE')
       execute procedure throw_an_error();

 We could do it that way, but the tag in ('CREATE TABLE', 'ALTER
 TABLE') syntax will be tricky to evaluate.  I'd really prefer NOT to
 need to start up and shut down the executor to figure out whether the
 trigger has to fire.  If we are going to do it that way then we need
 to carefully define what variables are available and what values they
 get.  I think that most people's event-filtering needs will be simple
 enough to be served by a more declarative syntax.

We could then maybe restrict that idea so that the syntax is more like

  when trigger variable in (literal[, ...])

So that we just have to store the array of strings and support only one
operation there. We might want to go as far as special casing the object
id to store an oid vector there rather than a text array, but we could
also decide not to support per-oid command triggers in the first
release and remove that from the list of accepted trigger variables.

 I also don't like the way you squeeze in the drop cascade support here
 by re qualifying it as a timing spec, which it's clearly not.

 On subcommand_start, what is the tag set to? the main command or the
 subcommand? if the former, where do you find the current subcommand tag?
 I don't think subcommand_start should be a timing spec either.

 That's up for grabs, but I was thinking the subcommand.  For example, 
 consider:

 alter table t alter column a add column b int, disable trigger all;

 I would imagine this having a firing sequence something like this:

 ddl_command_start:alter_table
 ddl_command_permissions_check:alter_table
 ddl_command_name_lookup:alter_table
 alter_table_subcommand_prep:add_column
 alter_table_subcommand_prep:disable_trigger_all
 alter_table_subcommand_start:add_column
 sql_create:column
 alter_table_subcommand_end:add_column
 alter_table_subcommand_start:disable_trigger_all
 alter_table_subcommand_end:disable_trigger_all
 ddl_command_end:alter_table

 Lots of room for argument there, of course.

Yeah well, in my mind the alter table actions are not subcommands yet
because they don't go back to standard_ProcessUtility(). The commands in
CREATE SCHEMA do that, same for create table foo(id serial primary key)
and its sequence and index.

We could maybe add another variable called context in the command
trigger procedure API that would generally be NULL and would be set to
the main command tag if we're running a subcommand.

We could still support alter table commands as sub commands as far as
the event triggers are concerned, and document that you don't get a
parse tree there even when implementing the trigger in C.

 Yeah, I would not try to pass the parse tree to every event trigger,
 just the ones where it's well-defined.  For drops, I would just say,
 hey, this is what we dropped.  The user can log it or ereport, but
 that's about it.  Still, that's clearly enough to do a lot of really
 interesting stuff, replication being the most obvious example.

Yes the use case is important enough to want to support it. Now, to
square it into the design. I think it should fire as a “normal” event
trigger, with the context set to e.g. DROP TYPE, the tag set to this
object classid (e.g. DROP FUNCTION).

It's easy enough to implement given that CreateCommandTag() already
switch (((DropStmt *) parsetree)-removeType), we just need to move that
code in another function and reuse it at the right places. It's only
missing DROP COLUMN apparently, I would have to add that.

So you would know that the trigger is firing for a DROP FUNCTION that

Re: [HACKERS] Command Triggers patch v18

2012-03-29 Thread Dimitri Fontaine
Robert Haas robertmh...@gmail.com writes:
 Apropos of nothing and since I haven't found a particularly good time
 to say this in amidst all the technical discussion, I appreciate very
 much all the work you've been putting into this.

Hey, thanks, I very much appreciate your support here!

 (1) is not hard to fix as soon as we set a policy, which the most
    pressing thing we need to do in my mind, whatever we manage to
    commit in 9.2.

 I agree that setting a policy is enormously important, not so sure
 about how easy it is to fix, but we shall see.

I think your proposal idea on the table is fixing it (the new event
timing specs), it's all about fine tuning it now.

 My primary and overriding goal here is to make sure that we don't make
 any design decisions, in the syntax or otherwise, that foreclose the
 ability to add policies that we've not yet dreamed of to future

Yeah we're not hard coding the list of possible event timings in the
syntax.

 In terms of policies, there are two that seems to me to be very clear:
 at the very beginning of ProcessUtility, and at the very end of it,
 *excluding* recursive calls to ProcessUtility that are intended to
 handle subcommands.  I think we could call the first start or
 command_start or ddl_command_start or post-parse or pre-locking or
 something along those lines, and the second end or command_end or
 ddl_command_end.  I think it might be worth including ddl in there
 because the scope of the current patch really is just DDL (as opposed
 to say DCL) and having a separate set of firing points for DCL does
 not seem dumb.  If we ever try to do anything with transaction control
 as you suggested upthread that's likely to also require special
 handling.  Calling it, specifically, ddl_command_start makes the scope
 very clear.

I'm not sure about that really. First, the only reason why DCL command
are not supported in the current patch is because roles are global
objects and we don't want role related behavior to depend on which
database you're currently connected to.

Then, I guess any utility command implementation will share the same
principle of having a series of step and allowing a user defined
function to get called in between some of them. So adding support for
commands that won't fit in the DDL timing specs we're trying to design
now amounts to adding timing specs for those commands.

I'm not sold that the timing specs should contain ddl or dcl or other
things, really, I find that to be ugly, but I won't fight over that.

 (2) can be addressed with a simple blacklist that would be set just
    before calling user defined code, and cleaned when done running it.
    When in place the blacklist lookup is easy to implement in a central
    place (utility.c or cmdtrigger.c) and ereport() when the current
    command is in the blacklist.

    e.g. alter table would blacklist alter table and drop table commands
    on the current object id

 Here we're talking about a firing point that we might call
 ddl_post_name_lookup or something along those lines.  I would prefer
 to handle this by making the following rules - here's I'm assuming
 that we're talking about the case where the object in question is a
 relation:

 1. The trigger fires immediately after RangeVarGetRelidExtended and
 before any other checks are performed, and especially before any
 CheckTableNotInUse(). This last is important, because what matters is
 whether the table's not in use AFTER all possible user-supplied code
 is executed.  If the trigger opens or closes cursors, for example,
 what matters is whether there are any cursors left open *after* the
 triggers complete, not whether there were any open on entering the
 trigger.  The same is true of any other integrity check: if there's a
 chance that the trigger could change something that affects whether
 the integrity constraint gets fired, then we'd better be darn sure to
 fire the trigger before we check the integrity constraint.

Ack.

 2. If we fire any triggers at this point, we CommandCounterIncrement()
 and then re-do RangeVarGetRelidExtended() with the same arguments we
 passed before.  If it doesn't return the same OID we got the first
 time, we abort with some kind of serialization error.  We need to be a
 little careful about the phrasing of the error message, because it's
 possible for this to change during command execution even without
 command triggers if somebody creates a table earlier in the search
 path with the same unqualified name that was passed to the command,
 but I think it's OK for the existence of a command-trigger to cause a
 spurious abort in that very narrow case, as long as the error message
 includes some appropriate weasel language.  Alternatively, we could
 try to avoid that corner case by rechecking only that a tuple with the
 right OID is still present and still has the correct relkind, but that
 seems like it might be a little less bullet-proof for no real
 functional gain.

I can buy having a 

Re: [HACKERS] Command Triggers patch v18

2012-03-29 Thread Robert Haas
On Thu, Mar 29, 2012 at 4:21 PM, Dimitri Fontaine
dimi...@2ndquadrant.fr wrote:
 Robert Haas robertmh...@gmail.com writes:
 Well, preceding and before are synonyms, so I don't see any advantage
 in that change.  But I really did mean AT permissions_checking time,
 not before or after it.  That is, we'd have a hook where instead of
 doing something like this:

 aclresult = pg_class_aclcheck(reloid, GetUserId(), ACL_SELECT);

 ...we'd call a superuser-supplied trigger function and let it make the

 Wow. I don't think I like that at all. It's indeed powerful, but how do
 you go explaining and fixing unanticipated behavior with such things in
 production? It looks too much like an invitation to break a very careful
 design where each facility has to rove itself to get in.

I'm thinking of things like extension whitelisting.  When some
unprivileged user says CREATE EXTENSION harmless, and harmless is
marked as superuser-only, we might like to have a hook that gets
called *at permissions-checking time* and gets to say, oh, well, that
extension is on the white-list, so we're going to allow it.  I think
you can come up with similar cases for other commands, where in
general the operation is restricted to superusers or database owners
or table owners but in specific cases you want to allow others to do
it.

 CREATE EVENT TRIGGER name ON event_name (event_subtype_name [, ...])
 EXECUTE PROCEDURE function_name(args);

 I would amend that syntax to allow for a WHEN expr much like in the
 DML trigger case, where the expression can play with the variables
 exposed at the time of calling the trigger.

  create event trigger prohibit_some_ddl
     preceding timing spec
          when tag in ('CREATE TABLE', 'ALTER TABLE')
       execute procedure throw_an_error();

 We could do it that way, but the tag in ('CREATE TABLE', 'ALTER
 TABLE') syntax will be tricky to evaluate.  I'd really prefer NOT to
 need to start up and shut down the executor to figure out whether the
 trigger has to fire.  If we are going to do it that way then we need
 to carefully define what variables are available and what values they
 get.  I think that most people's event-filtering needs will be simple
 enough to be served by a more declarative syntax.

 We could then maybe restrict that idea so that the syntax is more like

  when trigger variable in (literal[, ...])

 So that we just have to store the array of strings and support only one
 operation there. We might want to go as far as special casing the object
 id to store an oid vector there rather than a text array, but we could
 also decide not to support per-oid command triggers in the first
 release and remove that from the list of accepted trigger variables.

I guess that would make sense if you think there would ever be more
than one choice for trigger variable.  I'm not immediately seeing a
use case for that, though - I was explicitly viewing the syntax foo
(bar, baz) to mean foo when
the-only-trigger-variable-that-makes-sense-given-that-we-are-talking-about-a-trigger-on-foo
in (bar, baz).

More generally, my thought on the structure of this is that you're
going to have certain toplevel events, many of which will happen at
only a single place in the code, like an object got dropped or a
DDL command started or a DDL command ended.  So we give those
names, like sql_drop, ddl_command_start, and ddl_command_end.  Inside
your trigger procedure, the set of magic variables that is available
will depend on which toplevel event you set the trigger on, but
hopefully all firings of that toplevel event can provide the same
magic variables.  For example, at ddl_command_start time, you're just
gonna get the command tag, but at ddl_command_end time you will get
the command tag plus maybe some other stuff.

Now, we COULD stop there.  I mean, we could document that you can
create a trigger on ddl_command_start and every DDL command will fire
that trigger, and if the trigger doesn't care about some DDL
operations, then it can look at the command tag and return without
doing anything for the operations it doesn't care about.  The only
real disadvantage of doing it that way is speed, and maybe a bit of
code complexity within the trigger.  So my further thought was that
we'd allow you to specify an optional filter list to restrict which
events would fire the trigger, and the exact meaning of that filter
list would vary depending on the toplevel event you've chosen - i.e.
for ddl_command_start, the filter would be a list of commands, but for
sql_drop it would be a list of object types.  We could make that more
complicated if we think that an individual toplevel event will need
more than one kind of filtering.  For example, if you wanted to filter
sql_drop events based on the object type AND/OR the schema name, then
the syntax I proposed would obviously not be adequate.  I'm just not
convinced we need that, especially because you'd then need to set up a
dependency between the command trigger and the schema, 

Re: [HACKERS] Command Triggers patch v18

2012-03-28 Thread Dimitri Fontaine
Robert Haas robertmh...@gmail.com writes:
 I think BEFORE command triggers ideally should run
 * before permission checks
 * before locking
 * before internal checks are done (nameing conflicts, type checks and such)

 It is possible to do this, and it would actually be much easier and
 less invasive to implement than what Dimitri has done here, but the
 downside is that you won't have done the name lookup either.

There's a trade-off decision to take here, that was different in
previous versions of the patch. You can either run the trigger very
early or have specific information details.

The way to have both and keep your sanity, and that was implemented in
the patch, is to have ANY command triggers run before the process
utility big switch and provide only the command tag and parse tree, and
have the specific triggers called after permission, locking and internal
checks are done.

I've been asked to have a single call site for ANY and specific
triggers, which means you can't have BEFORE triggers running either
before or after those steps.

Now I can see why implementing that on top of the ANY command feature is
surprising enough for wanting to not do it this way. Maybe the answer is
to use another keyword to be able to register command triggers that run
that early and without any specific information other than the command
tag.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers patch v18

2012-03-28 Thread Robert Haas
On Wed, Mar 28, 2012 at 4:12 AM, Dimitri Fontaine
dimi...@2ndquadrant.fr wrote:
 Robert Haas robertmh...@gmail.com writes:
 I think BEFORE command triggers ideally should run
 * before permission checks
 * before locking
 * before internal checks are done (nameing conflicts, type checks and such)

 It is possible to do this, and it would actually be much easier and
 less invasive to implement than what Dimitri has done here, but the
 downside is that you won't have done the name lookup either.

 There's a trade-off decision to take here, that was different in
 previous versions of the patch. You can either run the trigger very
 early or have specific information details.

 The way to have both and keep your sanity, and that was implemented in
 the patch, is to have ANY command triggers run before the process
 utility big switch and provide only the command tag and parse tree, and
 have the specific triggers called after permission, locking and internal
 checks are done.

 I've been asked to have a single call site for ANY and specific
 triggers, which means you can't have BEFORE triggers running either
 before or after those steps.

 Now I can see why implementing that on top of the ANY command feature is
 surprising enough for wanting to not do it this way. Maybe the answer is
 to use another keyword to be able to register command triggers that run
 that early and without any specific information other than the command
 tag.

Yeah, I think so.  I objected to the way you had it because of the
inconsistency, not because I think it's a useless place to fire
triggers.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers patch v18

2012-03-28 Thread Robert Haas
On Wed, Mar 28, 2012 at 7:16 AM, Robert Haas robertmh...@gmail.com wrote:
 Now I can see why implementing that on top of the ANY command feature is
 surprising enough for wanting to not do it this way. Maybe the answer is
 to use another keyword to be able to register command triggers that run
 that early and without any specific information other than the command
 tag.

 Yeah, I think so.  I objected to the way you had it because of the
 inconsistency, not because I think it's a useless place to fire
 triggers.

Further thought: I think maybe we shouldn't use keywords at all for
this, and instead use descriptive strings like post-parse or
pre-execution or command-start, because I bet in the end we're going
to end up with a bunch of them (create-schema-subcommand-start?
alter-table-subcommand-start?), and it would be nice not to hassle
with the grammar every time we want to add a new one.  To make this
work with the grammar, we could either (1) enforce that each is
exactly one word or (2) require them to be quoted or (3) require those
that are not a single word to be quoted.  I tend think #2 might be the
best option in this case, but...

I've also had another general thought about safety: we need to make
sure that we're only firing triggers from places where it's safe for
user code to perform arbitrary actions.  In particular, we have to
assume that any triggers we invoke will AcceptInvalidationMessages()
and CommandCounterIncrement(); and we probably can't do it at all (or
at least not without some additional safeguard) in places where the
code does CheckTableNotInUse() up through the point where it's once
again safe to access the table, since the trigger might try.  We also
need to think about re-entrancy: that is, in theory, the trigger might
execute any other DDL command - e.g. it might drop the table that
we've been asked to rename.  I suspect that some of the current BEFORE
trigger stuff might fall down on that last one, since the existing
code not-unreasonably expects that once it's locked the table, the
catalog entries can't go away.  Maybe it all happens to work out OK,
but I don't think we can count on that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers patch v18

2012-03-28 Thread Dimitri Fontaine
Robert Haas robertmh...@gmail.com writes:
 Further thought: I think maybe we shouldn't use keywords at all for
 this, and instead use descriptive strings like post-parse or
 pre-execution or command-start, because I bet in the end we're going
 to end up with a bunch of them (create-schema-subcommand-start?
 alter-table-subcommand-start?), and it would be nice not to hassle
 with the grammar every time we want to add a new one.  To make this
 work with the grammar, we could either (1) enforce that each is
 exactly one word or (2) require them to be quoted or (3) require those
 that are not a single word to be quoted.  I tend think #2 might be the
 best option in this case, but...

What about :

  create command trigger foo before prepare alter table …
  create command trigger foo before start of alter table …
  create command trigger foo before execute alter table …
  create command trigger foo before end of alter table …

  create command trigger foo when prepare alter table …
  create command trigger foo when start alter table …
  create command trigger foo when execute of alter table …
  create command trigger foo when end of alter table …

  create command trigger foo preceding alter table …
  create command trigger foo preceding alter table … deferred
  create command trigger foo preceding alter table … immediate

  create command trigger foo following parser of alter table …
  create command trigger foo preceding execute of alter table …

  create command trigger foo following alter table …

  create command trigger foo before alter table nowait …

I'm not sure how many hooks we can really export here, but I guess we
have enough existing keywords to describe when they get to run (parser,
mapping, lock, check, begin, analyze, access, disable, not exists, do,
end, escape, extract, fetch, following, search…)

 I've also had another general thought about safety: we need to make
 sure that we're only firing triggers from places where it's safe for
 user code to perform arbitrary actions.  In particular, we have to
 assume that any triggers we invoke will AcceptInvalidationMessages()
 and CommandCounterIncrement(); and we probably can't do it at all (or
 at least not without some additional safeguard) in places where the
 code does CheckTableNotInUse() up through the point where it's once
 again safe to access the table, since the trigger might try.  We also

I've been trying to do that already.

 need to think about re-entrancy: that is, in theory, the trigger might
 execute any other DDL command - e.g. it might drop the table that
 we've been asked to rename.  I suspect that some of the current BEFORE

That's why I implemented ALTER COMMAND TRIGGER ... SET DISABLE in the
first place, so that you could run the same command from the trigger
itself without infinite recursion.

 trigger stuff might fall down on that last one, since the existing
 code not-unreasonably expects that once it's locked the table, the
 catalog entries can't go away.  Maybe it all happens to work out OK,
 but I don't think we can count on that.

I didn't think of DROP TABLE in a command table running BEFORE ALTER
TABLE, say. That looks evil. It could be documented as yet another way
to shoot yourself in the foot though?

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers patch v18

2012-03-27 Thread Dimitri Fontaine
Hi,

First things first, thanks for the review!

I'm working on a new version of the patch to fix all the specific
comments you called nitpicking here and in your previous email. This
new patch will also implement a single list of triggers to run in
alphabetical order, not split by ANY/specific command.

Robert Haas robertmh...@gmail.com writes:
 I am extremely concerned about the way in which this patch arranges to
 invoke command triggers.  You've got call sites spattered all over the
 place, and I think that's going to be, basically, an unfixable mess
 and a breeding ground for bugs.  On a first read-through:

In the first versions of the patch I did try to have a single point
where to integrate the feature and that didn't work out. If you want to
publish information such as object id, name and schema you need to have
specialized code spread out everywhere.

Then about where to call the trigger, it's a per-command decision with a
general policy. Your comments here are of two kinds, mostly about having
to guess the policy because it's not explicit, and some specifics that
either are in question or not following up on the policy.

The policy I've been willing to install is that command triggers should
get fired once the basic error checking is done. That's not perfect for
auditing purposes *if you want to log all attempts* but it's good enough
to audit all commands that had an impact on your system, and you still
can block commands.

Also, in most commands you can't get the object id and name before the
basic error checking is done.

Now, about the IF NOT EXISTS case, with the policy just described
there's no reason to trigger user code, but I can also see your position
here.

 1. BEFORE ALTER TABLE triggers are fired in AlterTable().  However, an

I used to have that in utility.c but Andres had me move that out, and it
seems like I should get back to utility.c for the reasons you're
mentioning and that I missed.

 2. BEFORE CREATE TABLE triggers seem to have similar issues; see
 transformCreateStmt.

 rhaas=# create table foo (a serial primary key);
 NOTICE:  CREATE TABLE will create implicit sequence foo_a_seq for
 serial column foo.a
 NOTICE:  snitch: BEFORE CREATE SEQUENCE public.foo_a_seq [NULL]
 NOTICE:  snitch: AFTER CREATE SEQUENCE public.foo_a_seq [16392]
 NOTICE:  snitch: BEFORE CREATE TABLE public.foo [NULL]
 NOTICE:  snitch: BEFORE CREATE INDEX public.foo_pkey [NULL]
 NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index
 foo_pkey for table foo
 NOTICE:  snitch: AFTER CREATE INDEX public.foo_pkey [16398]
 NOTICE:  snitch: BEFORE ALTER SEQUENCE public.foo_a_seq [16392]
 NOTICE:  snitch: AFTER ALTER SEQUENCE public.foo_a_seq [16392]
 NOTICE:  snitch: AFTER CREATE TABLE public.foo [16394]
 CREATE TABLE

That's meant to be a feature of the command trigger system, that's been
asked about by a lot of people. Having triggers fire for sub commands is
possibly The Right Thing™, or at least the most popular one.

 3. RemoveRelations() and RemoveObjects() similarly take the position
 that when the object does not exist, command triggers need not fire.
 This seems entirely arbitrary.  CREATE EXTENSION IF NOT EXISTS,
 however, takes the opposite (and probably correct) position that even
 if we decide not to create the extension, we should still fire command
 triggers.  In a similar vein, AlterFunctionOwner_oid() skips firing
 the command triggers if the old and new owners happen to be the same,
 but other forms of ALTER FUNCTION (e.g. ALTER FUNCTION .. COST) fire
 triggers even if the old and new parameters are the same; and
 AlterForeignDataWrapperOwner_internal does NOT skip firing command
 triggers just because the old and new owners are the same.

We integrate here with the code as it stands, I didn't question the
error management already existing. Do we need to revise that in the
scope of the command triggers patch?

 4. RemoveRelations() and RemoveObjects() also take the position that a
 statement like DROP TABLE foo, bar should fire each relevant BEFORE
 command trigger twice, then drop both objects, then fire each relevant
 AFTER command trigger twice.  I think that's wrong.  It's 100% clear

See above, it's what users are asking us to implement as a feature.

 5. It seems desirable for BEFORE command triggers to fire at a
 consistent point during command execution, but currently they don't.

The policy should be to fire the triggers once the usual error checking
is done.

 For example, BEFORE DROP VIEW triggers don't fire until we've verified
 that q exists, is a view, and that we have permission to drop it, but
 LOAD triggers fire much earlier, before we've really checked anything
 at all.  And ALTER TABLE is somewhere in the middle: we fire the
 BEFORE trigger after checking permissions on the main table, but
 before all permissions checks are done, viz:

I will rework LOAD, and ALTER TABLE too, which is more work as you can
imagine, because now we have to insinuate the command trigger 

Re: [HACKERS] Command Triggers patch v18

2012-03-27 Thread Robert Haas
On Tue, Mar 27, 2012 at 4:27 AM, Dimitri Fontaine
dimi...@2ndquadrant.fr wrote:
 In the first versions of the patch I did try to have a single point
 where to integrate the feature and that didn't work out. If you want to
 publish information such as object id, name and schema you need to have
 specialized code spread out everywhere.

[...]

 That's meant to be a feature of the command trigger system, that's been
 asked about by a lot of people. Having triggers fire for sub commands is
 possibly The Right Thing™, or at least the most popular one.

[...]

 I will rework LOAD, and ALTER TABLE too, which is more work as you can
 imagine, because now we have to insinuate the command trigger calling
 into each branch of what ALTER TABLE is able to implement.

[...]

 From last year's cluster hacker meeting then several mails on this list,
 it appears that we don't want to do it the way you propose, which indeed
 would be simpler to implement.

[...]

 I think that's a feature. You skip calling the commands trigger when you
 know you won't run the command at all.

I am coming more and more strongly to the conclusion that we're going
in the wrong direction here.  It seems to me that you're spending an
enormous amount of energy implementing something that goes by the name
COMMAND TRIGGER when what you really want is an EVENT TRIGGER.
Apparently, you don't want a callback every time someone types CREATE
TABLE: you want a callback every time a table gets created.  You don't
want a callback every time someone types DROP FUNCTION; you want a
callback every time a function gets dropped.  If the goal here is to
do replication, you'd more than likely even want such callbacks to
occur when the function is dropped indirectly via CASCADE.  In the
interest of getting event triggers, you're arguing that we should
contort the definition of the work command to include sub-commands,
but not necessarily commands that turn out to be a no-op, and maybe
things that are sort of like what commands do even though nobody
actually ever executed a command by that name.  I just don't think
that's a good idea.  We either have triggers on commands, and they
execute once per command, period; or we have triggers on events and
they execute every time that event happens.

As it turns out, two people have already put quite a bit of work into
designing and implementing what amounts to an event trigger system for
PostgreSQL: me, and KaiGai Kohei.  It's called the ObjectAccessHook
mechanism, and it fires every time we create or drop an object.  It
passes the type of object created or dropped, the OID of the object,
and the column number also in the case of a column.  The major
drawback of this mechanism is that you have to write the code you want
to execute in C, and you can't arrange for it to be executed via a DDL
command; instead, you have to set a global variable from your shared
library's _PG_init() function.  However, I don't think that would be
very hard to fix.  We could simply replaces the
InvokeObjectAccessHook() macro with a function that calls a list of
triggers pulled from the catalog.

I think there are a couple of advantages of this approach over what
you've got right now.  First, there are a bunch of tested hook points
that are already committed.  They have well-defined semantics that are
easy to understand: every time we create or drop an object, you get a
callback with these arguments.  Second, KaiGai Kohei is interested in
adding more hook points in the future to service sepgsql.  If the
command triggers code and the sepgsql code both use the same set of
hook points then (1) we won't clutter the code with multiple sets of
hook points and (2) any features that get added for purposes of
command triggers will also benefit sepgsql, and visca versa.  Since
the two of you are trying to solve very similar problems, it would be
nice if both of you were pulling in the same direction.  Third, and
most importantly, it seems to match the semantics you want, which is a
callback every time something *happens* rather than a callback every
time someone *types something*.

Specifically, I propose the following plan:

- Rename COMMAND TRIGGER to EVENT TRIGGER.  Rewrite the documentation
to say that we're going to fire triggers every time an *event*
happens.  Rewrite the code to put the firing mechanism inside
InvokeObjectAccessHook, which will become a function rather than a
macro.

- Change the list of supported trigger types to match what the
ObjectAccessHook mechanism understands, which means, at present,
post-create and drop.  It might even make sense to forget about having
separate hooks for every time of object that can be created or dropped
and instead just let people say CREATE EVENT TRIGGER name ON { CREATE
| DROP } EXECUTE PROCEDURE function_name ( arguments ).

- Once that's done, start adding object-access-hook invocations (and
thus, the corresponding command trigger functionality) to whatever
other operations you'd like to have 

Re: [HACKERS] Command Triggers patch v18

2012-03-27 Thread Andres Freund
On Tuesday, March 27, 2012 02:55:47 PM Robert Haas wrote:
 On Tue, Mar 27, 2012 at 4:27 AM, Dimitri Fontaine
 
 dimi...@2ndquadrant.fr wrote:
  In the first versions of the patch I did try to have a single point
  where to integrate the feature and that didn't work out. If you want to
  publish information such as object id, name and schema you need to have
  specialized code spread out everywhere.
 
 [...]
 
  That's meant to be a feature of the command trigger system, that's been
  asked about by a lot of people. Having triggers fire for sub commands is
  possibly The Right Thing™, or at least the most popular one.
 
 [...]
 
  I will rework LOAD, and ALTER TABLE too, which is more work as you can
  imagine, because now we have to insinuate the command trigger calling
  into each branch of what ALTER TABLE is able to implement.
 
 [...]
 
  From last year's cluster hacker meeting then several mails on this list,
  it appears that we don't want to do it the way you propose, which indeed
  would be simpler to implement.
 
 [...]
 
  I think that's a feature. You skip calling the commands trigger when you
  know you won't run the command at all.
 
 I am coming more and more strongly to the conclusion that we're going
 in the wrong direction here.  It seems to me that you're spending an
 enormous amount of energy implementing something that goes by the name
 COMMAND TRIGGER when what you really want is an EVENT TRIGGER.
 Apparently, you don't want a callback every time someone types CREATE
 TABLE: you want a callback every time a table gets created.
Not necessarily. One use-case is doing something *before* something happens 
like enforcing naming conventions, data types et al during development/schema 
migration.

 In the
 interest of getting event triggers, you're arguing that we should
 contort the definition of the work command to include sub-commands,
 but not necessarily commands that turn out to be a no-op, and maybe
 things that are sort of like what commands do even though nobody
 actually ever executed a command by that name.  I just don't think
 that's a good idea.  We either have triggers on commands, and they
 execute once per command, period; or we have triggers on events and
 they execute every time that event happens.
I don't think thats a very helpfull definition. What on earth would you want to 
do with plain passing of
CREATE SCHEMA blub CREATE TABLE foo() CREATE TABLE bar();

So some decomposition seems to be necessary. Getting the level right sure 
ain't totally easy.
It might be helpful to pass in the context from which something like that 
happened.

 As it turns out, two people have already put quite a bit of work into
 designing and implementing what amounts to an event trigger system for
 PostgreSQL: me, and KaiGai Kohei.  It's called the ObjectAccessHook
 mechanism, and it fires every time we create or drop an object.  It
 passes the type of object created or dropped, the OID of the object,
 and the column number also in the case of a column.  The major
 drawback of this mechanism is that you have to write the code you want
 to execute in C, and you can't arrange for it to be executed via a DDL
 command; instead, you have to set a global variable from your shared
 library's _PG_init() function.  However, I don't think that would be
 very hard to fix.  We could simply replaces the
 InvokeObjectAccessHook() macro with a function that calls a list of
 triggers pulled from the catalog.
Which would basically require including pg_dump in the backend to implement 
replication and logging. I don't think librarifying pg_dump is a fair burden 
at all.
Also I have a *very hard* time to imagining to sensibly implement replicating 
CREATE TABLE or ALTER TABLE ... ADD COLUMN with just object access hooks. 
There is just not enough context. How would you discern between a ADD COLUMN 
and a CREATE TABLE? They look very similar or even identical on a catalog 
level.

I also have the strong feeling that all this would expose implementation 
details *at least* as much as command triggers. A slight change in order of 
catalog modifcation would be *way* harder to hide via the object hook than 
something of a similar scale via command triggers.

 I think there are a couple of advantages of this approach over what
 you've got right now.  First, there are a bunch of tested hook points
 that are already committed.  They have well-defined semantics that are
 easy to understand: every time we create or drop an object, you get a
 callback with these arguments.  Second, KaiGai Kohei is interested in
 adding more hook points in the future to service sepgsql.  If the
 command triggers code and the sepgsql code both use the same set of
 hook points then (1) we won't clutter the code with multiple sets of
 hook points and (2) any features that get added for purposes of
 command triggers will also benefit sepgsql, and visca versa.  Since
 the two of you are trying to solve very similar problems, it would be
 nice 

Re: [HACKERS] Command Triggers patch v18

2012-03-27 Thread Robert Haas
On Tue, Mar 27, 2012 at 9:37 AM, Andres Freund and...@2ndquadrant.com wrote:
 Not necessarily. One use-case is doing something *before* something happens
 like enforcing naming conventions, data types et al during development/schema
 migration.

That is definitely a valid use case.  The only reason why we don't
have something like that built into the ObjectAccessHook framework is
because we haven't yet figured out a clean way to make it work.   Most
of KaiGai's previous attempts involved passing a bunch of garbage
selected apparently at random into the hook function, and I rejected
that as unmaintainable.  Dimitri's code here doesn't have that problem
- it passes in a consistent set of information across the board.  But
I still think it's unmaintainable, because there's no consistency
about when triggers get invoked, or whether they get invoked at all.
We need something that combines the strengths of both approaches.

I actually think that, to really meet all needs here, we may need the
ability to get control in more than one place.  For example, one thing
that KaiGai has wanted to do (and I bet Dimitri would live to be able
to do it too, and I'm almost sure that Dan Farina would like to do it)
is override the built-in security policy for particular commands.  I
think that KaiGai only wants to be able to deny things that would
normally be allowed, but I suspect some of those other folks would
also like to be able to allow things that would normally be denied.
For those use cases, you want to get control at permissions-checking
time.  However, where Dimitri has the hooks right now, BEFORE triggers
for ALTER commands fire after you've already looked up the object that
you're manipulating.  That has the advantage of allowing you to use
the OID of the object, rather than its name, to make policy decisions;
but by that time it's too late to cancel a denial-of-access by the
first-line permissions checks.  Dimitri also mentioned wanting to be
able to cancel the actual operation - and presumably, do something
else instead, like go execute it on a different node, and I think
that's another valid use case for this kind of trigger.  It's going to
take some work, though, to figure out what the right set of control
points is, and it's probably going to require some refactoring of the
existing code, which is a mess that I have been slowly trying to clean
up.

 In the
 interest of getting event triggers, you're arguing that we should
 contort the definition of the work command to include sub-commands,
 but not necessarily commands that turn out to be a no-op, and maybe
 things that are sort of like what commands do even though nobody
 actually ever executed a command by that name.  I just don't think
 that's a good idea.  We either have triggers on commands, and they
 execute once per command, period; or we have triggers on events and
 they execute every time that event happens.
 I don't think thats a very helpfull definition. What on earth would you want 
 to
 do with plain passing of
 CREATE SCHEMA blub CREATE TABLE foo() CREATE TABLE bar();

 So some decomposition seems to be necessary. Getting the level right sure
 ain't totally easy.
 It might be helpful to pass in the context from which something like that
 happened.

I agree that it's not a very helpful decision, but I'm not the one who
said we wanted command triggers rather than event triggers.  :-)

 Which would basically require including pg_dump in the backend to implement
 replication and logging. I don't think librarifying pg_dump is a fair burden
 at all.

I don't either, but that strikes me as a largely unrelated problem.
As you may recall, I've complained that too much of that functionality
is in the client in the past, and I haven't changed my opinion on
that.  But how is that any different with Dimitri's approach?  You can
get a callback AFTER CREATE TABLE, and you'll get the table name.  Now
what?  If you get the trigger in C you can get the node tree, but
that's hardly any better.  You're still going to need to do some
pretty tricky push-ups to get reliable replication.  It's not at all
evident to me that the parse-tree is any better a place to start than
the system catalog representation; in fact, I would argue that it's
probably much worse, because you'll have to exactly replicate whatever
the backend did to decide what catalog entries to create, or you'll
get drift between servers.

 Also I have a *very hard* time to imagining to sensibly implement replicating
 CREATE TABLE or ALTER TABLE ... ADD COLUMN with just object access hooks.
 There is just not enough context. How would you discern between a ADD COLUMN
 and a CREATE TABLE? They look very similar or even identical on a catalog
 level.

That can be fixed using the optional argument to
InvokeObjectAccessHook, similar to what we've done for differentiating
internal drops from other drops.

 I also have the strong feeling that all this would expose implementation
 details *at least* as much as 

Re: [HACKERS] Command Triggers patch v18

2012-03-27 Thread Dimitri Fontaine
Robert Haas robertmh...@gmail.com writes:
 I am coming more and more strongly to the conclusion that we're going
 in the wrong direction here.  It seems to me that you're spending an
 enormous amount of energy implementing something that goes by the name
 COMMAND TRIGGER when what you really want is an EVENT TRIGGER.

No.

The following two features are not allowed by what you call an EVENT
TRIGGER yet the very reason why I started working on COMMAND TRIGGERs:

 - BEFORE COMMAND TRIGGER
 - Having the command string available in the command trigger

Now, because of scheduling, the current patch has been reduced not to
include the second feature yet, which is a good trade-off for now. Yet
it's entirely possible to implement such feature as an extension once
9.2 is out given current COMMAND TRIGGER patch.

 I realize this represents a radical design change from what you have
 right now, but what you have right now is messy and ill-defined and I

That's only because I've not been doing the hard choices alone, I wanted
to be able to speak about them here, and the only time that discussion
happen is when serious hand down code review is being done.

My take?  Let's make the hard decisions together. Mechanisms are
implemented. The plural is what is causing problems here, but that also
mean we can indeed implement several policies now.

I've been proposing a non-messy policy in a previous mail, which I
realize the patch is not properly implementing now. I'd think moving the
patch to implement said policy (or another one after discussion) is next
step.

 don't think you can easily fix it.  You're exposing great gobs of
 implementation details which means that, inevitably, every time anyone
 wants to refactor some code, the semantics of command triggers are
 going to change, or else the developer will have to go to great
 lengths to ensure that they don't.  I don't think either of those
 things is going to make anyone very happy.

I guess you can't really have your cake and eat it too, right?

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers patch v18

2012-03-27 Thread Dimitri Fontaine
Robert Haas robertmh...@gmail.com writes:
 I actually think that, to really meet all needs here, we may need the
 ability to get control in more than one place.  For example, one thing
 that KaiGai has wanted to do (and I bet Dimitri would live to be able
 to do it too, and I'm almost sure that Dan Farina would like to do it)
 is override the built-in security policy for particular commands.  I

I had that in a previous version of the patch, and removed it because
you were concerned about our ability to review it in time for 9.2, which
has obviously been a right decision.

That was called INSTEAD OF command triggers, and they could call a
SECURITY DEFINER function.

 I agree that it's not a very helpful decision, but I'm not the one who
 said we wanted command triggers rather than event triggers.  :-)

Color me unconvinced about event triggers. That's not answering my use
cases.

 that.  But how is that any different with Dimitri's approach?  You can
 get a callback AFTER CREATE TABLE, and you'll get the table name.  Now
 what?  If you get the trigger in C you can get the node tree, but
 that's hardly any better.  You're still going to need to do some
 pretty tricky push-ups to get reliable replication.  It's not at all

What you do with the parse tree is rewrite the command. It's possible to
do, but would entail exposing the internal parser state which Tom
objects too. I'm now thinking that can be maintained as a C extension.

 evident to me that the parse-tree is any better a place to start than
 the system catalog representation; in fact, I would argue that it's
 probably much worse, because you'll have to exactly replicate whatever
 the backend did to decide what catalog entries to create, or you'll
 get drift between servers.

Try to build a command string from the catalogs… even if you can store a
snapshot of them before and after the command. Remember that you might
want to “replicate” to things that are NOT a PostgreSQL server.

 ambiguity.  If you say that we're going to have a trigger on the
 CREATE SEQUENCE command, then what happens when the user creates a
 sequence via some other method?  The current patch says that we should
 handle that by calling the CREATE SEQUENCE trigger if it happens to be
 convenient because we're going through the same code path that a
 normal CREATE SEQUENCE would have gone through, but if it uses a
 different code path then let's not bother.  Otherwise, how do you

Yes, the current set of which commands fire which triggers is explained
by how the code is written wrt standard_ProcessUtility() calls. We could
mark re-entrant calls and disable the command trigger feature, it would
not be our first backend global variable in flight.

 Dimitri is not the first or last person to want to get control during
 DDL operations, and KaiGai's already done a lot of work figuring out
 how to make it work reasonably.  Pre-create hooks don't exist in that
 machinery not because nobody wants them, but because it's hard.  This

I've been talking with Kaigai about using the Command Trigger
infrastructure to implement its control hooks, while reviewing one of
his patches, and he said that's not low-level enough for him.

 whole problem is hard.  It would be wrong to paint it as a problem
 that is unsolvable or not valuable, but it would be equally wrong to
 expect that it's easy or that anyone's first attempt (mine, yours,
 Dimitri's, KaiGai's, or Tom Lane's) is going to fall painlessly into
 place without anyone needing to sweat a little blood.

Sweating over that feature is a good summary of a whole lot of my and
some others' time lately.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers patch v18

2012-03-27 Thread Andres Freund
Hi,

On Tuesday, March 27, 2012 04:29:58 PM Robert Haas wrote:
 On Tue, Mar 27, 2012 at 9:37 AM, Andres Freund and...@2ndquadrant.com 
wrote:
  Not necessarily. One use-case is doing something *before* something
  happens like enforcing naming conventions, data types et al during
  development/schema migration.
 
 That is definitely a valid use case.  The only reason why we don't
 have something like that built into the ObjectAccessHook framework is
 because we haven't yet figured out a clean way to make it work.   Most
 of KaiGai's previous attempts involved passing a bunch of garbage
 selected apparently at random into the hook function, and I rejected
 that as unmaintainable.  Dimitri's code here doesn't have that problem
 - it passes in a consistent set of information across the board.  But
 I still think it's unmaintainable, because there's no consistency
 about when triggers get invoked, or whether they get invoked at all.
 We need something that combines the strengths of both approaches.
Yes.

I still think the likeliest way towards that is defining some behaviour we want 
regarding
* naming conflict handling
* locking

And then religiously make sure the patch adheres to that. That might need some 
refactoring of existing code (like putting a art of the utility.c handling of 
create table into its own function and such), but should be doable.

I think BEFORE command triggers ideally should run
* before permission checks
* before locking
* before internal checks are done (nameing conflicts, type checks and such)

Obviously some things will be caught before that (parse analysis of some 
commands) and I think we won't be able to fully stop that, but its not totally 
consistent now and while doing some work in the path of this patch seems 
sensible it cannot do-over everything wrt this.

Looking up oids and such before calling the trigger doesn't seem to be 
problematic from my pov. Command triggers are a superuser only facility, 
additional information they gain are no problem.
Thinking about that I think we should enforce command trigger functions to be 
security definer functions.

 I actually think that, to really meet all needs here, we may need the
 ability to get control in more than one place.
Not sure what you mean by that. Before/after permission checks, before/after 
consistency checks? That sort of thing?

 For example, one thing
 that KaiGai has wanted to do (and I bet Dimitri would live to be able
 to do it too, and I'm almost sure that Dan Farina would like to do it)
 is override the built-in security policy for particular commands.
Dim definitely seems to want that: https://github.com/dimitri/pgextwlist ;)

 I think that KaiGai only wants to be able to deny things that would
 normally be allowed, but I suspect some of those other folks would
 also like to be able to allow things that would normally be denied.
Denying seems to be easier than allowing stuff safely

 For those use cases, you want to get control at permissions-checking
 time.  However, where Dimitri has the hooks right now, BEFORE triggers
 for ALTER commands fire after you've already looked up the object that
 you're manipulating.  That has the advantage of allowing you to use
 the OID of the object, rather than its name, to make policy decisions;
 but by that time it's too late to cancel a denial-of-access by the
 first-line permissions checks.
Why? Just throw a access denied exception? Unless its done after the locking 
that won't be visible by anything but timing?

Additional granting is more complex though, I am definitely with you there. 
That will probably need INSTEAD triggers which I don't see for now. Those will 
have their own share of problems.

 Dimitri also mentioned wanting to be able to cancel the actual operation - 
 and presumably, do something else instead, like go execute it on a different 
 node, and I think that's another valid use case for this kind of trigger.  
 It's going to take some work, though, to figure out what the right set of 
 control points is, and it's probably going to require some refactoring of
 the existing code, which is a mess that I have been slowly trying to clean
 up.
I commend your bravery...

  In the interest of getting event triggers, you're arguing that we should
  contort the definition of the work command to include sub-commands,
  but not necessarily commands that turn out to be a no-op, and maybe
  things that are sort of like what commands do even though nobody
  actually ever executed a command by that name.  I just don't think
  that's a good idea.  We either have triggers on commands, and they
  execute once per command, period; or we have triggers on events and
  they execute every time that event happens.
  I don't think thats a very helpfull definition. What on earth would you
  want to do with plain passing of
  CREATE SCHEMA blub CREATE TABLE foo() CREATE TABLE bar();
  So some decomposition seems to be necessary. Getting the level right sure
  ain't 

Re: [HACKERS] Command Triggers patch v18

2012-03-27 Thread Robert Haas
On Tue, Mar 27, 2012 at 11:05 AM, Dimitri Fontaine
dimi...@2ndquadrant.fr wrote:
 I agree that it's not a very helpful decision, but I'm not the one who
 said we wanted command triggers rather than event triggers.  :-)

 Color me unconvinced about event triggers. That's not answering my use
 cases.

Could we get a list of those use cases, maybe on a wiki page
somewhere, and add to it all the use cases that KaiGai Kohei or others
who are interested in this are aware of?  Or maybe we can just discuss
via email, but it's going to be hard to agree that we've got something
that meets the requirements or doesn't if we're all imagining
different sets of requirements.  The main use cases I can think of
are:

1. Replication.  That is, after we perform a DDL operation, we call a
trigger and tell it what we did, so that it can make a record of that
information and ship it to another machine, where it will arrange to
do the same thing on the remote side.

2. Redirection.  That is, before we perform a DDL operation, we call a
trigger and tell it what we've been asked to do, so that it can go
execute the request elsewhere (e.g. the master rather than the Hot
Standby slave).

3. Additional security or policy checks.  That is, before we perform a
DDL operation, we call a trigger and tell it what we've been asked to
do, so that it can throw an error if it doesn't like our credentials
or, say, the naming convention we've used for our column names.

4. Relaxation of security checks.  That is, we let a trigger get
control at permissions-checking time and let it make the go-or-no-go
decision in lieu of the usual permissions-checking code.

5. Auditing.  Either when something is attempted (i.e. before) or
after it happens, we log the attempt/event somewhere.

Anything else?

In that list of use cases, it seems to me that you want BEFORE and
AFTER triggers to have somewhat different firing points.  For the
BEFORE cases, you really need the command trigger to fire early, and
once.  For example, if someone says ALTER TABLE dimitri ADD COLUMN
fontaine INTEGER, DROP COLUMN IF EXISTS haas, permissions on dimitri
should really only get checked once, not once for each subcommand.
That's the point at which you need to get control for #3 or #4, and it
would be workable for #5 as well; I'm less sure about #2.  On the
other hand, for the AFTER cases I've listed here, I think you really
want to know what *actually* happened, not what somebody thought about
doing.  You want to know which tables, sequences, etc. *actually* got
created or dropped, not the ones that the user mentioned.  If the user
mentioned a table but we didn't drop it (and we also didn't error out,
because IF EXISTS) is used, none of the AFTER cases really care; if we
dropped other stuff (because of CASCADE) the AFTER cases may very well
care.

Another thing to think about with respect to deciding on the correct
firing points is that you can't fire easily the trigger after we've
identified the object in question but before we've checked permissions
on it, which otherwise seems like an awfully desirable thing to do for
use cases 3, 4, and 5 from the above list, and maybe 2 as well.  We
don't want to try to take locks on objects that the current user
doesn't have permission to access, because then a user with no
permissions whatsoever on an object can interfere with access by
authorized users.  On the flip side, we can't reliably check
permissions before we've locked the object, because somebody else
might rename or drop it after we check permissions and before we get
the lock.  Noah Misch invented a clever technique that I then used to
fix a bunch of these problems in 9.2: the fixed code (sadly, not all
cases are fixed yet, due to the fact that we ran out of time in the
development cycle) looks up the object name, checks permissions
(erroring out if the check fails), and then locks the object.  Once it
gets the lock, it checks whether any shared-invalidation messages have
been processed since the point just before we looked up the object
name.  If so, it redoes the name lookup.  If the referrent of the name
has not changed, we're done; if it has, we drop the old lock and
relock the new object and loop around again, not being content until
we're sure that the object we locked is still the referrant of the
name.  This leads to much more logical behavior than the old way of
doing things, and not incidentally gets rid of a lot of errors of the
form cache lookup failed for relation %u that users of existing
releases will remember, probably not too fondly.

However, it's got serious implications for triggers that want to relax
security policy, because the scope of what you can do inside that loop
is pretty limited.  You can't really do anything to the relation while
you're checking permissions on it, because you haven't locked it yet.
If you injected a trigger there, it would have to be similarly
limited, and I don't know how we'd enforce that, and it would have to
be prepared to 

Re: [HACKERS] Command Triggers patch v18

2012-03-27 Thread Robert Haas
On Tue, Mar 27, 2012 at 11:58 AM, Andres Freund and...@2ndquadrant.com wrote:
 I still think the likeliest way towards that is defining some behaviour we 
 want
 regarding
 * naming conflict handling
 * locking

 And then religiously make sure the patch adheres to that. That might need some
 refactoring of existing code (like putting a art of the utility.c handling of
 create table into its own function and such), but should be doable.

 I think BEFORE command triggers ideally should run
 * before permission checks
 * before locking
 * before internal checks are done (nameing conflicts, type checks and such)

It is possible to do this, and it would actually be much easier and
less invasive to implement than what Dimitri has done here, but the
downside is that you won't have done the name lookup either.  See my
last email to Dimitri for a long explanation of why that restriction
is not easily circumventable: name lookups, locking, and permission
checks are necessarily and inextricably intertwined.  Still, if others
agree this is useful, I think it would be a lot easier to get right
than what we have now.

 Obviously some things will be caught before that (parse analysis of some
 commands) and I think we won't be able to fully stop that, but its not totally
 consistent now and while doing some work in the path of this patch seems
 sensible it cannot do-over everything wrt this.

Some of what we're now doing as part of parse analysis should really
be reclassified.  For example, the ProcessUtility branch that handles
T_CreateStmt and T_CreateForeignTableStmt should really be split out
as a separate function that lives in tablecmds.c, and I think at least
some of what's in transformCreateStmt should be moved to tablecmds.c
as well.  The current arrangement makes it really hard to guarantee
things like the name gets looked up just once, which seems obviously
desirable, since strange things will surely happen if the whole
command doesn't agree on which OID it's operating on.

 Looking up oids and such before calling the trigger doesn't seem to be
 problematic from my pov. Command triggers are a superuser only facility,
 additional information they gain are no problem.
 Thinking about that I think we should enforce command trigger functions to be
 security definer functions.

I don't see any benefit from that restriction.

 I actually think that, to really meet all needs here, we may need the
 ability to get control in more than one place.
 Not sure what you mean by that. Before/after permission checks, before/after
 consistency checks? That sort of thing?

Yes.  For example, above you proposed a kind of trigger that fires
really early - basically after parsing but before everything else.
What Dimitri has implemented is a much later trigger that is still
before the meat of the command, but not before *everything*.  And then
there's an AFTER trigger, which could fire either after an individual
piece or stage of the command, or after the whole command is complete,
either of which seems potentially useful depending on the
circumstances.  I almost think that the BEFORE/AFTER name serves to
confuse rather than to clarify, in this case.  It's really a series of
specific hook points: whenever we parse a command (but before we
execute it), after security and sanity checks but before we begin
executing the command, before or after various subcommands, after the
whole command is done, and maybe a few others.  When we say BEFORE or
AFTER, we implicitly assume that we want at most two of the things
from that list, and I am not at all sure that's what going to be
enough.

One thing I had thought about doing, in the context of sepgsql, and we
may yet do it, is create a hook that gets invoked whenever we need to
decide whether it's OK to perform an action on an object.  For
example, consider ALTER TABLE .. ADD FOREIGN KEY: we'd ask the hook
both is it OK to add a foreign key to table X? and also is it OK to
make a foreign key refer to table Y?  This doesn't fit into the
command-trigger framework at all, but it's definitely useful for
sepgsql, and I bet it's good for other things, too - maybe not that
specific example, but that kind of thing.

 I think that KaiGai only wants to be able to deny things that would
 normally be allowed, but I suspect some of those other folks would
 also like to be able to allow things that would normally be denied.
 Denying seems to be easier than allowing stuff safely

Yes.

 For those use cases, you want to get control at permissions-checking
 time.  However, where Dimitri has the hooks right now, BEFORE triggers
 for ALTER commands fire after you've already looked up the object that
 you're manipulating.  That has the advantage of allowing you to use
 the OID of the object, rather than its name, to make policy decisions;
 but by that time it's too late to cancel a denial-of-access by the
 first-line permissions checks.
 Why? Just throw a access denied exception? Unless its done after the 

Re: [HACKERS] Command Triggers patch v18

2012-03-27 Thread Andres Freund
Hi,

On Tuesday, March 27, 2012 07:34:46 PM Robert Haas wrote:
 On Tue, Mar 27, 2012 at 11:58 AM, Andres Freund and...@2ndquadrant.com 
wrote:
  I still think the likeliest way towards that is defining some behaviour
  we want regarding
  * naming conflict handling
  * locking
  
  And then religiously make sure the patch adheres to that. That might need
  some refactoring of existing code (like putting a art of the utility.c
  handling of create table into its own function and such), but should be
  doable.
  
  I think BEFORE command triggers ideally should run
  * before permission checks
  * before locking
  * before internal checks are done (nameing conflicts, type checks and
  such)
 It is possible to do this, and it would actually be much easier and
 less invasive to implement than what Dimitri has done here, but the
 downside is that you won't have done the name lookup either.  See my
 last email to Dimitri for a long explanation of why that restriction
 is not easily circumventable: name lookups, locking, and permission
 checks are necessarily and inextricably intertwined.
Read your other mail. Comming back to it I think the above might be to 
restrictive to be usefull for a big part of use-cases. So your idea of more 
hook points below has some merits.

 Still, if others
 agree this is useful, I think it would be a lot easier to get right
 than what we have now.
I think its pretty important to have something thats usable rather easily 
which requires names to be resolved and thus permission checks already 
performed and (some) locks already acquired.
I think quite some of the possible usages need the facility to be as simple as 
possible and won't care about already acquired locks/names.


 Some of what we're now doing as part of parse analysis should really
 be reclassified.  For example, the ProcessUtility branch that handles
 T_CreateStmt and T_CreateForeignTableStmt should really be split out
 as a separate function that lives in tablecmds.c, and I think at least
 some of what's in transformCreateStmt should be moved to tablecmds.c
 as well.  The current arrangement makes it really hard to guarantee
 things like the name gets looked up just once, which seems obviously
 desirable, since strange things will surely happen if the whole
 command doesn't agree on which OID it's operating on.
Yes, I aggree, most of that should go from utility.c.

  Looking up oids and such before calling the trigger doesn't seem to be
  problematic from my pov. Command triggers are a superuser only facility,
  additional information they gain are no problem.
  Thinking about that I think we should enforce command trigger functions
  to be security definer functions.
 I don't see any benefit from that restriction.
The reason I was thinking it might be a good idea is that they get might get 
more knowledge passed than the user could get directly otherwise. Especially 
if we extend the scheme to more places/informations.

  I actually think that, to really meet all needs here, we may need the
  ability to get control in more than one place.
  Not sure what you mean by that. Before/after permission checks,
  before/after consistency checks? That sort of thing?
 Yes.  For example, above you proposed a kind of trigger that fires
 really early - basically after parsing but before everything else.
 What Dimitri has implemented is a much later trigger that is still
 before the meat of the command, but not before *everything*.  And then
 there's an AFTER trigger, which could fire either after an individual
 piece or stage of the command, or after the whole command is complete,
 either of which seems potentially useful depending on the
 circumstances.  I almost think that the BEFORE/AFTER name serves to
 confuse rather than to clarify, in this case.  It's really a series of
 specific hook points: whenever we parse a command (but before we
 execute it), after security and sanity checks but before we begin
 executing the command, before or after various subcommands, after the
 whole command is done, and maybe a few others.  When we say BEFORE or
 AFTER, we implicitly assume that we want at most two of the things
 from that list, and I am not at all sure that's what going to be
 enough.
You might have a point there. Not sure if the complexity of explaining all the 
different hook points is worth the pain.

What are the phases we have:

* after parse
  * logging
* after resolving name
* after checking permisssions  (interwined with the former)
  * override permission check? INSTEAD?
* after locking (interwined with the former)
  * except it needs to be connected to resolving the name/permission check 
this doesn't seem to be an attractive hook point
* after validation
  * additional validation likely want to hook here
* after execution
  * replication might want to hook here

Am I missing an interesting phase?

Allowing all that probably seems to require a substantial refactoring of 
commands/ and catalog/

 One thing I had 

Re: [HACKERS] Command Triggers patch v18

2012-03-27 Thread Robert Haas
On Tue, Mar 27, 2012 at 4:05 PM, Andres Freund and...@2ndquadrant.com wrote:
  Looking up oids and such before calling the trigger doesn't seem to be
  problematic from my pov. Command triggers are a superuser only facility,
  additional information they gain are no problem.
  Thinking about that I think we should enforce command trigger functions
  to be security definer functions.
 I don't see any benefit from that restriction.
 The reason I was thinking it might be a good idea is that they get might get
 more knowledge passed than the user could get directly otherwise. Especially
 if we extend the scheme to more places/informations.

As long as the superuser gets to decide whether or not a given
function is installed as a command trigger, there's no harm in
allowing him to make it either SECURITY INVOKER or SECURITY DEFINER.
I agree that making it SECURITY DEFINER will often be useful and
appropriate; I just see no reason to enforce that restriction
programatically.

 What are the phases we have:

 * after parse
  * logging
 * after resolving name
 * after checking permisssions  (interwined with the former)
  * override permission check? INSTEAD?
 * after locking (interwined with the former)
  * except it needs to be connected to resolving the name/permission check
 this doesn't seem to be an attractive hook point
 * after validation
  * additional validation likely want to hook here
 * after execution
  * replication might want to hook here

 Am I missing an interesting phase?

I'd sort of categorize it like this:

- pre-parse
- post-parse
- at permissions-checking time
- post permissions-checking/name-lookup, before starting the main work
of the command, i.e. pre-execution
- event type triggers that happen when specific actions are
performed (e.g. CREATE, DROP, etc.) or as subcommands fire (e.g. in
ALTER TABLE, we could fire an alter-table-subcommand trigger every
time we execute a subcommand)
- post-execution

 Allowing all that probably seems to require a substantial refactoring of
 commands/ and catalog/

Probably.  But we don't need to allow it all at once.  What we need to
do is pick one or two things that are relatively easily done,
implement those first, and then build on it.  Pre-parse, post-parse,
CREATE-event, DROP-event, and post-execution hooks are all pretty easy
to implement without major refactoring, I think.  OTOH,
post-permissions-checking-pre-execution is going to be hard to
implement without refactoring, and ALTER-event hooks are going to be
hard, too.

 I think you need a surprisingly high amount of context to know when to ignore
 a trigger. Especially as its not exactly easy to transfer knowledge from one
 to the next.

I'm not convinced, but maybe it would be easier to resolve this in the
context of a specific proposal.

 I don't think creating *new* DDL from the parsetree can really count as
 statement based replication. And again, I don't think implementing that will
 cost that much effort.
 How would it help us to know - as individual events! - which tuples where
 created where? Reassembling that will be a huge mess. I basically fail to see
 *any* use case besides access checking.

I think if you'd said this to me two years ago, I would have believed
you, but poking through this code in the last year or two, I've come
to the conclusion that there are a lot of subtle things that get
worked out after parse time, during execution.  Aside from things like
recursing down the inheritance hierarchy and maybe creating some
indexes or sequences when creating a table, there's also subtle things
like working out exactly what index we're creating: name, access
method, operator class, collation, etc.  And I'm pretty sure that if
we examine the code carefully we'll find there are a bunch more.

 I fear a bit that this discussion is leading to something thats never going to
 materialize because it would require a huge amount of work to get there.

Again, the way to avoid that is to tackle the simple cases first and
then work on the harder cases after that, but I don't think that's
what the current patch is doing.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers, v16

2012-03-26 Thread Robert Haas
On Sun, Mar 25, 2012 at 12:15 PM, Andres Freund and...@2ndquadrant.com wrote:
 On Friday, March 16, 2012 10:40:46 AM Dimitri Fontaine wrote:
  This will have the effect of calling triggers outside of alphabetic
  order. I don't think thats a good idea even if one part is ANY and the
  other a specific command.
  I don't think there is any reason anymore to separate the two? The only

  callsite seems to look like:
 The idea is to have a predictable ordering of command triggers. The code
 changed in the patch v16 (you pasted code from git in between v15 and
 v16, I cleaned it up) and is now easier to read:

                 case CMD_TRIGGER_FIRED_BEFORE:
                         whenstr = BEFORE;
                         procs[0] = cmd-before_any;
                         procs[1] = cmd-before;
                         break;

                 case CMD_TRIGGER_FIRED_AFTER:
                         whenstr = AFTER;
                         procs[0] = cmd-after;
                         procs[1] = cmd-after_any;
                         break;

 So it's BEFORE ANY then BEFORE command then AFTER command then AFTER
 ANY. That's an arbitrary I made and we can easily reconsider. Triggers
 are called in alphabetical order in each “slot” here.

 In my mind it makes sense to have ANY triggers around the specific
 triggers, but it's hard to explain why that feels better.
 I still think this would be a mistake. I don't have a hard time imagining
 usecases where a specific trigger should be called before or after an ANY
 trigger because e.g. it wants to return a more specific error or doesn't want
 to check all preconditions already done by the ANY trigger... All that would
 be precluded by enforcing a strict ordering between ANY and specific triggers.
 I don't see a use-case that would benefit from the current behaviour...

Dimitri's proposed behavior would be advantageous if you have an ANY
trigger that wants to take over the world and make sure that nobody
else can run before it.  I think, though, that's not a case we want to
cater to - all of this stuff requires superuser privileges anyway, so
we should assume that the DBA knows what he's doing.  So +1 for making
it strictly alphabetical, as we do with other triggers.  Everything
that can be done under Dimitri's proposal can also be done in that
scheme, but the reverse is not true.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers, v16

2012-03-26 Thread Dimitri Fontaine
Robert Haas robertmh...@gmail.com writes:
 Dimitri's proposed behavior would be advantageous if you have an ANY
 trigger that wants to take over the world and make sure that nobody
 else can run before it.  I think, though, that's not a case we want to
 cater to - all of this stuff requires superuser privileges anyway, so
 we should assume that the DBA knows what he's doing.  So +1 for making

What about extensions?

One use case would be for londiste/slony/bucardo to rewrite the command
and queue its text, that will be done in C and should probably be done
first. Using an ANY command trigger means that code will run before user
specific code (or likewise after it).

As I said it's not that clear in my head, but when thinking about
command trigger and extensions, it could be better to impose an
arbitrary order here.

 it strictly alphabetical, as we do with other triggers.  Everything
 that can be done under Dimitri's proposal can also be done in that
 scheme, but the reverse is not true.

That's true too. I'm just not sure how much it applies to code installed
by the DBA as opposed to written by the DBA. I'll be happy to edit the
patch to melt both lists if that's the decision, it's not hard to do so.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers, v16

2012-03-26 Thread Thom Brown
Can someone clarify whether this will be reviewed by a committer?
Will there be time to get this reviewed before the commitfest closes?
I get the impression the commitfest closure is fairly imminent.

Thom

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers, v16

2012-03-26 Thread Robert Haas
On Mon, Mar 26, 2012 at 3:24 PM, Dimitri Fontaine
dimi...@2ndquadrant.fr wrote:
 One use case would be for londiste/slony/bucardo to rewrite the command
 and queue its text, that will be done in C and should probably be done
 first. Using an ANY command trigger means that code will run before user
 specific code (or likewise after it).

And, if the user wants it to be run first, he or she can do that.  But
suppose he wants to run it first, and also forbid users whose username
starts with the letter b from running the ANALYZE command.  Well,
then, he now wants that trigger to come before the other one, even
though the Slony trigger is for all commands (maybe) and the other
just for ANALYZE (maybe).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers, v16

2012-03-26 Thread Tom Lane
Dimitri Fontaine dimi...@2ndquadrant.fr writes:
 Robert Haas robertmh...@gmail.com writes:
 Dimitri's proposed behavior would be advantageous if you have an ANY
 trigger that wants to take over the world and make sure that nobody
 else can run before it.  I think, though, that's not a case we want to
 cater to - all of this stuff requires superuser privileges anyway, so
 we should assume that the DBA knows what he's doing.  So +1 for making

 What about extensions?

 One use case would be for londiste/slony/bucardo to rewrite the command
 and queue its text, that will be done in C and should probably be done
 first. Using an ANY command trigger means that code will run before user
 specific code (or likewise after it).

Unless you intend a restriction that there be only one ANY trigger,
I don't see how that follows.  AFAICS, the only way to guarantee my
trigger runs first is to give it a name alphabetically before anything
else in the system.

Also, it strikes me that in most of the trigger ordering cases I've
seen, it's actually more interesting to want to be sure that a
particular trigger runs *last* so that its effects can't be modified
by some other, hypothetically less trusted, trigger.

So I don't think that the mere fact of being an ANY trigger rather than
a command-specific trigger should be taken to mean that a particular
ordering is desirable.  Trigger name order isn't the greatest solution
by any means, but it's more flexible than hard-wiring according to
trigger type.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers, v16

2012-03-26 Thread Tom Lane
Thom Brown thombr...@gmail.com writes:
 Can someone clarify whether this will be reviewed by a committer?
 Will there be time to get this reviewed before the commitfest closes?
 I get the impression the commitfest closure is fairly imminent.

I don't have that impression.  (I wish I did.)

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers, v16

2012-03-26 Thread Robert Haas
On Mon, Mar 26, 2012 at 3:36 PM, Thom Brown thombr...@gmail.com wrote:
 Can someone clarify whether this will be reviewed by a committer?
 Will there be time to get this reviewed before the commitfest closes?
 I get the impression the commitfest closure is fairly imminent.

Well, I have been holding off for two reasons:

1. It sure seems like there is an awful lot of code churn and design
work going on.

2. I'm not sure which patches Tom is planning to look at or in what
order, so I've been avoiding the ones he seems to be taking an
interest in.

Personally, I am about at the point where I'd like to punt everything
and move on.  As nice as it would be to squeeze a few more things into
9.2, there WILL be a 9.3.  If a few less people had submitted
half-baked code at the last minute and a few more people had helped
with review, we'd be done by now.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers, v16

2012-03-26 Thread Dimitri Fontaine
Tom Lane t...@sss.pgh.pa.us writes:
 So I don't think that the mere fact of being an ANY trigger rather than
 a command-specific trigger should be taken to mean that a particular
 ordering is desirable.  Trigger name order isn't the greatest solution
 by any means, but it's more flexible than hard-wiring according to
 trigger type.

That ANY sandwich idea is then dead, I will fix it tomorrow to rather
just handle a single list of BEFORE and AFTER triggers (that's 2 lists
total) ordered by trigger name.

v19 will also integrate latest doc comments from Thom and most from
Andres, I don't know how to fix the plpython specifics he's talking
about.

About the reviewing and the commit fest closing, even if that patch is
big it's a principled implementation: the integration of the facility is
done in the same way in lots of different places, and is not rocket
science either (we removed all the complexity). So I guess it's not
really an herculean job here, just a certain amount of mechanical edits:
we just support too many commands ;)

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers, v16

2012-03-26 Thread Andres Freund
On Monday, March 26, 2012 10:18:59 PM Dimitri Fontaine wrote:
  don't know how to fix the plpython specifics he's talking
 about.
Just copy what is done in the normal trigger handling facility (the decref 
both in the CATCH and in the normal path). Ping me some other way if you need 
help...

Andres
-- 
Andres Freund   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers, v16

2012-03-26 Thread Dimitri Fontaine
Robert Haas robertmh...@gmail.com writes:
 1. It sure seems like there is an awful lot of code churn and design
 work going on.

There has only been minor adjustments for a while now, and they have
been visible because Thom was doing lots of testing for me and it was
way easier for me to publish a new version and have a test result the
next day (thanks again Thom).

 Personally, I am about at the point where I'd like to punt everything
 and move on.  As nice as it would be to squeeze a few more things into
 9.2, there WILL be a 9.3.  If a few less people had submitted
 half-baked code at the last minute and a few more people had helped
 with review, we'd be done by now.

Well, wait a minute. There's a difference between half-baked and
reacting to a review that changes the goal of a patch. My idea of the
code I wanted to write here is extremely different from what we as a
community decided to be doing. The main part of the code churn has been
answering to review, removing features and cleaning the code afterwards.

The only major design decision that I had to change here has been about
from where to call in the command trigger code in the existing commands
implementation, and it was done before entering this CF, IIRC.

If you want to punt this patch out of 9.2 after all the changes I had to
make for it to be a candidate for 9.2, I think it would be only fair for
you to find a show stopper in my current implementation. The trigger
firing order is about an hour of work, so not a stopper I believe.

And as soon as we're done here, you know I'll put the same hours and
energy into reviewing other people patches :)

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers, v16

2012-03-26 Thread Thom Brown
On 26 March 2012 21:36, Dimitri Fontaine dimi...@2ndquadrant.fr wrote:
 Robert Haas robertmh...@gmail.com writes:
 Personally, I am about at the point where I'd like to punt everything
 and move on.  As nice as it would be to squeeze a few more things into
 9.2, there WILL be a 9.3.  If a few less people had submitted
 half-baked code at the last minute and a few more people had helped
 with review, we'd be done by now.

 Well, wait a minute. There's a difference between half-baked and
 reacting to a review that changes the goal of a patch. My idea of the

I think Robert was referring to patches in general.

Thom

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers, v16

2012-03-26 Thread Robert Haas
On Mon, Mar 26, 2012 at 4:36 PM, Dimitri Fontaine
dimi...@2ndquadrant.fr wrote:
 Well, wait a minute. There's a difference between half-baked and
 reacting to a review that changes the goal of a patch. My idea of the
 code I wanted to write here is extremely different from what we as a
 community decided to be doing. The main part of the code churn has been
 answering to review, removing features and cleaning the code afterwards.

Sure.  And if this weren't the final CommitFest, we would have bumped
this to the next CommitFest two months ago.  Since it is the final
CommitFest, we're going to go on and on and on.  Already we have
almost as many patches queued up for the next CommitFest as we do
remaining in this one.  So my question is: how long should we keep all
those people waiting for the purpose of squeezing more stuff into 9.2?
 At the beginning of this CommitFest, various people offered time
frames ranging between 6 weeks to 2 months.  We're past that now.  If
you don't think that was a long enough time frame, then how long do
you think we should wait?   It doesn't seem to me to matter very much
whether this got stretched out due to fundamental design deficiencies
or just because it takes a while to beat a major feature into
committable form; surely it's not a shock that this patch wasn't going
to go in overnight.

 If you want to punt this patch out of 9.2 after all the changes I had to
 make for it to be a candidate for 9.2, I think it would be only fair for
 you to find a show stopper in my current implementation. The trigger
 firing order is about an hour of work, so not a stopper I believe.

I don't think there is a show-stopper.  I do think there is probably a
lot more cleaning up, tidying, and adjusting needed.

 And as soon as we're done here, you know I'll put the same hours and
 energy into reviewing other people patches :)

As soon as we're done here, the CommitFest will end, and there won't
be any other people's patches to review.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers, v16

2012-03-26 Thread Christopher Browne
On Mon, Mar 26, 2012 at 4:45 PM, Robert Haas robertmh...@gmail.com wrote:
 As soon as we're done here, the CommitFest will end, and there won't
 be any other people's patches to review.

Hurray?  :-)
-- 
When confronted by a difficult problem, solve it by reducing it to the
question, How would the Lone Ranger handle this?

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers, v16

2012-03-26 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 2. I'm not sure which patches Tom is planning to look at or in what
 order, so I've been avoiding the ones he seems to be taking an
 interest in.

Well, I think I'm definitely on the hook for the pg_stat_statements,
pgsql_fdw, foreign table stats, and caching-stable-subexpressions
patches, and I should look at the libpq alternate row returning
mechanism because I suspect I was the last one to mess with that libpq
code in any detail.  I don't claim any special insight into the other
stuff on the list.  In particular I've not been paying much attention
to command triggers.

 Personally, I am about at the point where I'd like to punt everything
 and move on.  As nice as it would be to squeeze a few more things into
 9.2, there WILL be a 9.3.  If a few less people had submitted
 half-baked code at the last minute and a few more people had helped
 with review, we'd be done by now.

The main reason I proposed setting a schedule a few weeks ago was that
I was afraid the commitfest would otherwise end precisely in a we're
tired out, we're punting everything to 9.3 moment.  Without some
definite goal to work towards, it'll just keep stretching out until
we've had enough.  I'd prefer it end in a more orderly fashion than
that.  The end result will be the same, in the sense that some of the
stuff that's still-not-ready-for-committer is going to get punted,
but people might have a less bad taste in their mouths about why.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers, v16

2012-03-26 Thread Robert Haas
On Mar 26, 2012, at 5:36 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 2. I'm not sure which patches Tom is planning to look at or in what
 order, so I've been avoiding the ones he seems to be taking an
 interest in.
 
 Well, I think I'm definitely on the hook for the pg_stat_statements,
 pgsql_fdw, foreign table stats, and caching-stable-subexpressions
 patches, and I should look at the libpq alternate row returning
 mechanism because I suspect I was the last one to mess with that libpq
 code in any detail.  I don't claim any special insight into the other
 stuff on the list.  In particular I've not been paying much attention
 to command triggers.

How long will that all take?

I guess I'll work on command triggers, pg_archivecleanup, and buffer I/O 
timings next.

 
 Personally, I am about at the point where I'd like to punt everything
 and move on.  As nice as it would be to squeeze a few more things into
 9.2, there WILL be a 9.3.  If a few less people had submitted
 half-baked code at the last minute and a few more people had helped
 with review, we'd be done by now.
 
 The main reason I proposed setting a schedule a few weeks ago was that
 I was afraid the commitfest would otherwise end precisely in a we're
 tired out, we're punting everything to 9.3 moment.  Without some
 definite goal to work towards, it'll just keep stretching out until
 we've had enough.  I'd prefer it end in a more orderly fashion than
 that.  The end result will be the same, in the sense that some of the
 stuff that's still-not-ready-for-committer is going to get punted,
 but people might have a less bad taste in their mouths about why.

Fine. What do you propose, specifically?

...Robert
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers patch v18

2012-03-26 Thread Robert Haas
On Tue, Mar 20, 2012 at 1:49 PM, Dimitri Fontaine
dimi...@2ndquadrant.fr wrote:
 Hi,

 I guess I sent v17 a little early considering that we now already have
 v18 including support for CREATE TABLE AS and SELECT INTO, thanks to the
 work of Andres and Tom.

 There was some spurious tags in the sgml files in v17 that I did clean
 up too.

There are spurious whitespace changes in the documentation.  Some of
these are of the following form:

- return_arr
+ return_arr

create_trigger.sgml adds a stray blank line as well.

I think that the syntax for enabling or disabling a command trigger
should not use the keyword SET.  So just:

ALTER COMMAND TRIGGER name ENABLE [ ALWAYS | REPLICA ];
ALTER COMMAND TRIGGER name DISABLE;

That way is more parallel with the existing syntax for ordinary triggers.

+  The name to give the new trigger. This must be distinct from the name
+  of any other trigger for the same table. The name cannot be
+  schema-qualified.

For the same table is a copy-and-pasteo.

-   /* Look up object address. */
+   /* Look up object address. */

Spurious diff.

I think that get_object_address_cmdtrigger should be turned into a
case within get_object_address_unqualified; I can't see a reason for
it to have its own function.

+   elog(ERROR, could not find tuple for command trigger
%u, trigOid);

The standard phrasing is cache lookup failed for command trigger %u.

+/*
+ * Functions to execute the command triggers.
+ *
+ * We call the functions that matches the command triggers definitions in
+ * alphabetical order, and give them those arguments:
+ *
+ *   command tag, text
+ *   objectId, oid
+ *   schemaname, text
+ *   objectname, text
+ *
+ */

This will not survive pgindent, unless you surround it with -- and
-.  More generally, it would be nice if you could pgindent the
whole patch and then fix anything that breaks in such a way that it
will survive the next pgindent run.

+ * if (CommandFiresTriggers(cmd)

Missing paren.

+   /* before or instead of command trigger might have cancelled
the command */

Leftovers.

@@ -169,7 +169,7 @@ ExplainQuery(ExplainStmt *stmt, const char *queryString,
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
 errmsg(EXPLAIN option BUFFERS
requires ANALYZE)));
-
+
/* if the timing was not set explicitly, set default value */
es.timing = (timing_set) ? es.timing : es.analyze;

Spurious diff.

All the changes to ruleutils.c appear spurious.  Ditto the change to
src/test/regress/sql/triggers.sql.

In the pg_dump support, it seems you're going to try to execute an
empty query if this is run against a pre-9.2 server.  It seems like
you should just return, or something like that.  What do we do in
comparable cases elsewhere?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers patch v18

2012-03-26 Thread Robert Haas
On Mon, Mar 26, 2012 at 9:11 PM, Robert Haas robertmh...@gmail.com wrote:
 [ various trivial issues ]

OK, now I got that out of my system.  Now on to bigger topics.

I am extremely concerned about the way in which this patch arranges to
invoke command triggers.  You've got call sites spattered all over the
place, and I think that's going to be, basically, an unfixable mess
and a breeding ground for bugs.  On a first read-through:

1. BEFORE ALTER TABLE triggers are fired in AlterTable().  However, an
ALTER TABLE statement does not necessarily call AlterTable() once and
only once.  The real top-level logic for AlterTable is in
ProcessUtility(), which runs transformAlterTableStmt() to generate a
list of commands and then either calls AlterTable() or recursively
invokes ProcessUtility() for each one.  This means that if IF EXISTS
is used and the table does not exist, then BEFORE command triggers
won't get invoked at all.  On the other hand, if multiple commands are
specified, then I think AlterTable() may get invoked either once or
more than once, depending on exactly which commands are specified; and
we might manage to fire some CREATE INDEX command triggers or whatnot
as well, again depending on exactly what that ALTER TABLE command is
doing.

2. BEFORE CREATE TABLE triggers seem to have similar issues; see
transformCreateStmt.

rhaas=# create table foo (a serial primary key);
NOTICE:  CREATE TABLE will create implicit sequence foo_a_seq for
serial column foo.a
NOTICE:  snitch: BEFORE CREATE SEQUENCE public.foo_a_seq [NULL]
NOTICE:  snitch: AFTER CREATE SEQUENCE public.foo_a_seq [16392]
NOTICE:  snitch: BEFORE CREATE TABLE public.foo [NULL]
NOTICE:  snitch: BEFORE CREATE INDEX public.foo_pkey [NULL]
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index
foo_pkey for table foo
NOTICE:  snitch: AFTER CREATE INDEX public.foo_pkey [16398]
NOTICE:  snitch: BEFORE ALTER SEQUENCE public.foo_a_seq [16392]
NOTICE:  snitch: AFTER ALTER SEQUENCE public.foo_a_seq [16392]
NOTICE:  snitch: AFTER CREATE TABLE public.foo [16394]
CREATE TABLE

3. RemoveRelations() and RemoveObjects() similarly take the position
that when the object does not exist, command triggers need not fire.
This seems entirely arbitrary.  CREATE EXTENSION IF NOT EXISTS,
however, takes the opposite (and probably correct) position that even
if we decide not to create the extension, we should still fire command
triggers.  In a similar vein, AlterFunctionOwner_oid() skips firing
the command triggers if the old and new owners happen to be the same,
but other forms of ALTER FUNCTION (e.g. ALTER FUNCTION .. COST) fire
triggers even if the old and new parameters are the same; and
AlterForeignDataWrapperOwner_internal does NOT skip firing command
triggers just because the old and new owners are the same.

4. RemoveRelations() and RemoveObjects() also take the position that a
statement like DROP TABLE foo, bar should fire each relevant BEFORE
command trigger twice, then drop both objects, then fire each relevant
AFTER command trigger twice.  I think that's wrong.  It's 100% clear
that the user executed one, and only one, command.  The only real
argument for it is that if we were to only fire command triggers once,
we wouldn't be able to pass identifying information about the objects
being dropped, since the argument-passing mechanism only has room to
pass details about a single object.  I think that means that the
argument-passing mechanism needs improvement, not that we should
redefine one command as two commands.  Something like CREATE SCHEMA
foo CREATE VIEW bar ... has the same problem: the user only entered
one command, but we're firing command triggers as if they entered
multiple commands.  This case is possibly more defensible than the
other one, but note that the two aren't consistent with each other as
regards the order of trigger firing, and I actually think that they're
both wrong, and that only the toplevel command should fire triggers.

5. It seems desirable for BEFORE command triggers to fire at a
consistent point during command execution, but currently they don't.
For example, BEFORE DROP VIEW triggers don't fire until we've verified
that q exists, is a view, and that we have permission to drop it, but
LOAD triggers fire much earlier, before we've really checked anything
at all.  And ALTER TABLE is somewhere in the middle: we fire the
BEFORE trigger after checking permissions on the main table, but
before all permissions checks are done, viz:

rhaas= alter table p add foreign key (a) references p2 (a);
NOTICE:  snitch: BEFORE ALTER TABLE public.p [16418]
ERROR:  permission denied for relation p2

6. Another pretty hideous aspect of the CREATE TABLE behavior is that
AFTER triggers are fired from a completely different place than BEFORE
triggers.   It's not this patch's fault that our CREATE TABLE and
ALTER TABLE code is a Rube Goldberg machine, but finding some place to
jam each trigger invocation in that happens to (mostly) work as the

Re: [HACKERS] Command Triggers patch v18

2012-03-25 Thread Andres Freund
On Friday, March 23, 2012 04:32:02 PM Dimitri Fontaine wrote:
 I would like to get back on code level review now if at all possible,
 and I would integrate your suggestions here into the next patch revision
 if another one is needed.
Ok, I will give it another go.

Btw I just wanted to alert you to being careful when checking in the expect 
files ;)

 NOTICE:  snitch: BEFORE any DROP TRIGGER
-ERROR:  unexpected name list length (3)
+NOTICE:  snitch: BEFORE DROP TRIGGER NULL foo_trigger
+NOTICE:  snitch: AFTER any DROP TRIGGER
 create conversion test for 'utf8' to 'sjis' from utf8_to_sjis;
j

you had an apparerently un-noticed error in there ;)


1.
if (!HeapTupleIsValid(tup))
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_OBJECT),
 errmsg(command trigger \%s\ does not exist, 
skipping,
trigname)));
The skipping part looks like a copy/pasto...

2.
In PLy_exec_command_trigger youre doing a PG_TRY() which looks pointless in 
the current incarnation. Did you intend to add something in the catch?
I think without doing a decref of pltdata both in the sucess and the failure 
path youre leaking memory.

3.
In plpython: Why do you pass objectId/pltobjectname/... as NULL instead of 
None? Using a string for it seems like a bad from of in-band signalling to me.

4. 
Not sure whether InitCommandContext is the best place to suppress command 
trigger usage for some commands. That seems rather unintuitive to me. But 
perhaps the implementation-ease is big enough...

Thats everything new I found... Not bad I think. After this somebody else 
should take a look at I think (commiter or not).

 The only point yet to address from last round from Andres is about the
 API around CommandFiresTrigger() and the Memory Context we use here.
 We're missing an explicit Reset call, and to be able to have we need to
 have a more complex API, because of the way RemoveObjects() and
 RemoveRelations() work.
 
 We would need to add no-reset APIs and an entry point to manually reset
 the memory context, which currently gets disposed at the same time as
 its parent context, the current one that's been setup before entering
 standard_ProcessUtility().
Not sure if youre expecting further input from me about that?

Greetings,

Andres
-- 
Andres Freund   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers, v16

2012-03-25 Thread Andres Freund
On Friday, March 16, 2012 10:40:46 AM Dimitri Fontaine wrote:
  This will have the effect of calling triggers outside of alphabetic
  order. I don't think thats a good idea even if one part is ANY and the
  other a specific command.
  I don't think there is any reason anymore to separate the two? The only
 
  callsite seems to look like:
 The idea is to have a predictable ordering of command triggers. The code
 changed in the patch v16 (you pasted code from git in between v15 and
 v16, I cleaned it up) and is now easier to read:
 
 case CMD_TRIGGER_FIRED_BEFORE:
 whenstr = BEFORE;
 procs[0] = cmd-before_any;
 procs[1] = cmd-before;
 break;
 
 case CMD_TRIGGER_FIRED_AFTER:
 whenstr = AFTER;
 procs[0] = cmd-after;
 procs[1] = cmd-after_any;
 break;
 
 So it's BEFORE ANY then BEFORE command then AFTER command then AFTER
 ANY. That's an arbitrary I made and we can easily reconsider. Triggers
 are called in alphabetical order in each “slot” here.
 
 In my mind it makes sense to have ANY triggers around the specific
 triggers, but it's hard to explain why that feels better.
I still think this would be a mistake. I don't have a hard time imagining 
usecases where a specific trigger should be called before or after an ANY 
trigger because e.g. it wants to return a more specific error or doesn't want 
to check all preconditions already done by the ANY trigger... All that would 
be precluded by enforcing a strict ordering between ANY and specific triggers.
I don't see a use-case that would benefit from the current behaviour...

Andres
-- 
Andres Freund   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers patch v18

2012-03-23 Thread Dimitri Fontaine
Thom Brown t...@linux.com writes:
 The new command triggers work correctly.

Thanks for your continued testing :)

 Having looked at your regression tests, you don't seem to have enough
 before triggers in the tests.  There's no test for before CREATE
 TABLE, CREATE TABLE AS or SELECT INTO.  In my tests I have 170 unique
 command triggers, but there are only 44 in the regression test.  Is
 there a reason why there aren't many tests?

Now that we share the same code for ANY triggers and specific ones, I
guess we could drop a lot of specific command triggers from the
regression tests.

 A problem still outstanding is that when I build the docs, the CREATE

I would like to get back on code level review now if at all possible,
and I would integrate your suggestions here into the next patch revision
if another one is needed.

The only point yet to address from last round from Andres is about the
API around CommandFiresTrigger() and the Memory Context we use here.
We're missing an explicit Reset call, and to be able to have we need to
have a more complex API, because of the way RemoveObjects() and
RemoveRelations() work.

We would need to add no-reset APIs and an entry point to manually reset
the memory context, which currently gets disposed at the same time as
its parent context, the current one that's been setup before entering
standard_ProcessUtility().

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers

2012-03-21 Thread Robert Haas
On Tue, Mar 20, 2012 at 12:02 PM, Andres Freund and...@anarazel.de wrote:
 The attached patch applies from 8.3 to 9.1 (8.2 has conflicts but
 thankfully...).

It seems it doesn't apply to master (any more?).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers

2012-03-21 Thread Andres Freund
On Wednesday, March 21, 2012 04:54:00 PM Robert Haas wrote:
 On Tue, Mar 20, 2012 at 12:02 PM, Andres Freund and...@anarazel.de wrote:
  The attached patch applies from 8.3 to 9.1 (8.2 has conflicts but
  thankfully...).
 
 It seems it doesn't apply to master (any more?).
Its not required there because of the unification of CTAS with normal code. 
Sorry, that was only clear from a few mails away, should have made that 
explicit.

Andres

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers

2012-03-21 Thread Robert Haas
On Wed, Mar 21, 2012 at 11:58 AM, Andres Freund and...@anarazel.de wrote:
 On Wednesday, March 21, 2012 04:54:00 PM Robert Haas wrote:
 On Tue, Mar 20, 2012 at 12:02 PM, Andres Freund and...@anarazel.de wrote:
  The attached patch applies from 8.3 to 9.1 (8.2 has conflicts but
  thankfully...).

 It seems it doesn't apply to master (any more?).
 Its not required there because of the unification of CTAS with normal code.
 Sorry, that was only clear from a few mails away, should have made that
 explicit.

Or maybe I should have tested it before jumping to conclusions.

Anyway, I've committed the patch to 9.1, 9.0, 8.4, and 8.3.

Thanks,

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers, patch v11

2012-03-20 Thread Andres Freund
On Tuesday, March 20, 2012 02:39:56 AM Tom Lane wrote:
 I've applied the CTAS patch after rather heavy editorialization.  Don't
 know what consequences that will have for Dimitri's patch.
Thanks for all the work you put into this! Looks cleaner now...

Thanks,

Andres

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers

2012-03-20 Thread Andres Freund
On Tuesday, February 28, 2012 12:43:02 AM Andres Freund wrote:
 On Tuesday, February 28, 2012 12:30:36 AM Tom Lane wrote:
  Andres Freund and...@anarazel.de writes:
   Sorry for letting this slide.
   
   Is it worth adding this bit to OpenIntoRel? Not sure if there is danger
   in allowing anyone to create shared tables
   
 /* In all cases disallow placing user relations in pg_global */
 if (tablespaceId == GLOBALTABLESPACE_OID)
 
 ereport(ERROR,
 
 (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
 
  errmsg(only shared relations can be placed in 
   pg_global
   
   tablespace)));
  
  Ugh ... if that's currently allowed, we definitely need to fix it.
  But I'm not sure OpenIntoRel is the right place.  I'd have expected
  the test to be at some lower level, like heap_create_with_catalog
  or so.
 
 Its definitely allowed right now:
 
 test-upgrade=# CREATE TABLE foo TABLESPACE pg_global AS SELECT 1;
 SELECT 1
 Time: 354.097 ms
 
 The analogous check for the missing one in OpenIntoRel for plain relations
 is in defineRelation. heap_create_with_catalog only contains the inverse
 check:
 
   /*
* Shared relations must be in pg_global (last-ditch check)
*/
   if (shared_relation  reltablespace != GLOBALTABLESPACE_OID)
   elog(ERROR, shared relations must be placed in pg_global
 tablespace);
 
 
 Moving it there sounds like a good idea without any problem I can see right
 now. Want me to prepare a patch or is it just the same for you if you do it
 yourself?
Sorry to bother you with that dreary topic further, but this should probably 
be fixed before the next set of stable releases.

The check cannot easily be moved to heap_create_with_catalog because e.g. 
cluster.c's make_new_heap does heap_create_with_catalog for a temporary copy 
of shared relations without being able to mark them as such (because they 
don't have the right oid and thus IsSharedRelation would cry). So I think just 
adding the same check to the ctas code as the normal DefineRelation contains 
is the best way forward.

The attached patch applies from 8.3 to 9.1 (8.2 has conflicts but 
thankfully...).

Andres
From 77896a7385d1ef5c793e06c5085a8b37ab5857c9 Mon Sep 17 00:00:00 2001
From: Andres Freund and...@anarazel.de
Date: Tue, 20 Mar 2012 16:33:13 +0100
Subject: [PATCH] Check that the specified tablespace in a CREATE TABLE AS
 command is not pg_global

That check was not added to the CTAS code when it was added to the ordinary
CREATE TABLE AS.

It might be nicer to add that check heap_create_with_catalog as well, but thats
not easily possible because e.g. cluster creates a temporary new heap which
cannot be marked shared because there is a fixed list of shared relations (see
IsSharedRelation).
---
 src/backend/executor/execMain.c |7 +++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 621ad8a..8a43db7 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -43,6 +43,7 @@
 #include access/xact.h
 #include catalog/heap.h
 #include catalog/namespace.h
+#include catalog/pg_tablespace.h
 #include catalog/toasting.h
 #include commands/tablespace.h
 #include commands/trigger.h
@@ -2452,6 +2453,12 @@ OpenIntoRel(QueryDesc *queryDesc)
 		   get_tablespace_name(tablespaceId));
 	}
 
+	/* In all cases disallow placing user relations in pg_global */
+	if (tablespaceId == GLOBALTABLESPACE_OID)
+		ereport(ERROR,
+		(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+		 errmsg(only shared relations can be placed in pg_global tablespace)));
+
 	/* Parse and validate any reloptions */
 	reloptions = transformRelOptions((Datum) 0,
 	 into-options,
-- 
1.7.6.409.ge7a85.dirty


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers, patch v11

2012-03-20 Thread Dimitri Fontaine
Tom Lane t...@sss.pgh.pa.us writes:
 I've applied the CTAS patch after rather heavy editorialization.  Don't
 know what consequences that will have for Dimitri's patch.

It allows my patch to add support for CREATE TABLE AS and SELECT INTO,
I've been doing that and am on my way to sending a v18 now. The way you
worked out the command tag is exactly what I needed, so thanks a lot for
your work comitting this and paying attention :)

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers patch v18

2012-03-20 Thread Thom Brown
On 20 March 2012 17:49, Dimitri Fontaine dimi...@2ndquadrant.fr wrote:
 Hi,

 I guess I sent v17 a little early considering that we now already have
 v18 including support for CREATE TABLE AS and SELECT INTO, thanks to the
 work of Andres and Tom.

 There was some spurious tags in the sgml files in v17 that I did clean
 up too.

The new command triggers work correctly.

Having looked at your regression tests, you don't seem to have enough
before triggers in the tests.  There's no test for before CREATE
TABLE, CREATE TABLE AS or SELECT INTO.  In my tests I have 170 unique
command triggers, but there are only 44 in the regression test.  Is
there a reason why there aren't many tests?

A problem still outstanding is that when I build the docs, the CREATE
COMMAND TRIGGER is listed after COMMAND TRIGGER in
html/sql-commands.html.  I recall you mentioned you didn't have this
issue on your side.  Can you just confirm this again?  I believe I've
located the cause of this problem.  In doc/src/sgml/reference.sgml the
ALTER/CREATE/DROP COMMAND TRIGGER references are placed below their
respective trigger counterparts.  Putting them back into alphabetical
order corrects the issue.

On the pg_cmdtrigger page in the documentation, I'd recommend the
following changes:

s/Trigger's name/Trigger name/

s/Command TAG/Command tag/

s/The OID of the function that is called by this command trigger./The
OID of the function called by this command trigger./

Also contype should read ctgtype.

Note that I haven't tested pl/Perl, pl/Python or pl/tcl yet.

Regards

-- 
Thom

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers, patch v11

2012-03-19 Thread Dimitri Fontaine
Thom Brown t...@linux.com writes:
 Will you also be committing the trigger function variable changes
 shortly?  I don't wish to test anything prior to this as that will
 involve a complete re-test from my side anyway.

It's on its way, I had to spend time elsewhere, sorry about that. With
some luck I can post a intermediate patch later this evening limited to
PLpgSQL support for your testing.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers, patch v11

2012-03-19 Thread Andres Freund
On Sunday, March 18, 2012 07:29:30 PM Tom Lane wrote:
 BTW, I've been looking through how to do what I suggested earlier to get
 rid of the coziness and code duplication between CreateTableAs and the
 prepare.c code; namely, let CreateTableAs create a DestReceiver and then
 call ExecuteQuery with that receiver.  It appears that we still need at
 least two bits of added complexity with that approach:
 
 1. ExecuteQuery still has to know that's a CREATE TABLE AS operation so
 that it can enforce that the prepared query is a SELECT.  (BTW, maybe
 this should be weakened to something that returns tuples, in view of
 RETURNING?)
I don't really see the use case but given the amount of work it probably takes 
it seems reasonable to allow that.

 2. Since ExecuteQuery goes through the Portal machinery, there's no way
 for it to pass in eflags to control the table OIDs setup.  This is
 easily solved by adding an eflags parameter to PortalStart, which
 doesn't seem too awful to me, since the typical caller would just pass
 zero.  (ExecuteQuery would also have to know about testing
 interpretOidsOption to compute the eflags to use, unless we add an
 eflags parameter to it, which might be the cleaner option.)
If we go down this route I think adding an eflag is the better choice. Thinking 
of it - my patch already added that ;)

 In short I'm thinking: add an eflags parameter to PortalStart, and add
 both an eflags parameter and a bool mustReturnTuples parameter to
 ExecuteQuery.  This definitely seems cleaner than the current
 duplication of code.
Hm. I am not *that* convinced anymore. It wasn't that much duplication in the 
end...

Andres

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers, patch v11

2012-03-19 Thread Robert Haas
On Sun, Mar 18, 2012 at 2:29 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 1. ExecuteQuery still has to know that's a CREATE TABLE AS operation so
 that it can enforce that the prepared query is a SELECT.  (BTW, maybe
 this should be weakened to something that returns tuples, in view of
 RETURNING?)

+1 for something that returns with tuples.   CREATE TABLE ... AS
DELETE FROM ... WHERE ... RETURNING ... seems like a cool thing to
support.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers, patch v11

2012-03-19 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Sun, Mar 18, 2012 at 2:29 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 1. ExecuteQuery still has to know that's a CREATE TABLE AS operation so
 that it can enforce that the prepared query is a SELECT.  (BTW, maybe
 this should be weakened to something that returns tuples, in view of
 RETURNING?)

 +1 for something that returns with tuples.   CREATE TABLE ... AS
 DELETE FROM ... WHERE ... RETURNING ... seems like a cool thing to
 support.

For the moment I've backed off that idea.  The main definitional
question we'd have to resolve is whether we want to allow WITH NO DATA,
and if so what does that mean (should the DELETE execute, or not?).
I am also not certain that the RETURNING code paths would cope with
a WITH OIDS specification, and there are some other things that would
need fixed.  It might be cool to do it sometime, but it's not going to
happen in this patch.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers, patch v11

2012-03-19 Thread Robert Haas
On Mon, Mar 19, 2012 at 12:45 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Sun, Mar 18, 2012 at 2:29 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 1. ExecuteQuery still has to know that's a CREATE TABLE AS operation so
 that it can enforce that the prepared query is a SELECT.  (BTW, maybe
 this should be weakened to something that returns tuples, in view of
 RETURNING?)

 +1 for something that returns with tuples.   CREATE TABLE ... AS
 DELETE FROM ... WHERE ... RETURNING ... seems like a cool thing to
 support.

 For the moment I've backed off that idea.  The main definitional
 question we'd have to resolve is whether we want to allow WITH NO DATA,
 and if so what does that mean (should the DELETE execute, or not?).
 I am also not certain that the RETURNING code paths would cope with
 a WITH OIDS specification, and there are some other things that would
 need fixed.  It might be cool to do it sometime, but it's not going to
 happen in this patch.

Fair enough.  It would be nice to have, but it definitely does not
seem worth spending a lot of time on right now.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers, patch v11

2012-03-19 Thread Peter Eisentraut
On sön, 2012-03-18 at 21:16 -0400, Tom Lane wrote:
 If we were going to change the output at all, I would vote for CREATE
 TABLE  so as to preserve the rowcount functionality.  Keep in
 mind though that this would force client-side changes, for instance in
 libpq's PQcmdTuples().  Fixing that one routine isn't so painful, but
 what of other client-side libraries, not to mention applications?

Doesn't seem worth it to me.  At least, SELECT  makes some sense:
 rows were selected.  CREATE TABLE  means what?   tables
were created?

What might make sense is to delegate this additional information to
separate fields in a future protocol revision.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers, patch v11

2012-03-19 Thread Robert Haas
On Mon, Mar 19, 2012 at 12:53 PM, Peter Eisentraut pete...@gmx.net wrote:
 On sön, 2012-03-18 at 21:16 -0400, Tom Lane wrote:
 If we were going to change the output at all, I would vote for CREATE
 TABLE  so as to preserve the rowcount functionality.  Keep in
 mind though that this would force client-side changes, for instance in
 libpq's PQcmdTuples().  Fixing that one routine isn't so painful, but
 what of other client-side libraries, not to mention applications?

 Doesn't seem worth it to me.  At least, SELECT  makes some sense:
  rows were selected.  CREATE TABLE  means what?   tables
 were created?

 What might make sense is to delegate this additional information to
 separate fields in a future protocol revision.

I think that we would not have bothered to add the row count to the
command tag output for SELECT unless it were useful.  It seems to be
*more* useful for CTAS than for SELECT; after all, SELECT also returns
the actual rows.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers, patch v11

2012-03-19 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Mon, Mar 19, 2012 at 12:53 PM, Peter Eisentraut pete...@gmx.net wrote:
 Doesn't seem worth it to me.  At least, SELECT  makes some sense:
  rows were selected.  CREATE TABLE  means what?   tables
 were created?
 
 What might make sense is to delegate this additional information to
 separate fields in a future protocol revision.

 I think that we would not have bothered to add the row count to the
 command tag output for SELECT unless it were useful.  It seems to be
 *more* useful for CTAS than for SELECT; after all, SELECT also returns
 the actual rows.

I think we're all in agreement that we need to keep the rowcount
functionality.  What seems to me to be in some doubt is whether to
continue to present the tag SELECT  or to change it to something
like CREATE TABLE .  For the moment I've got the patch doing the
former.  It would not be terribly hard to change it, but I'm not going
to break backward compatibility unless there's a pretty clear consensus
to do so.


BTW, I just came across another marginal-loss-of-functionality issue:
in previous versions you could PREPARE a SELECT INTO, but now you get

regression=# prepare foo as select * into bar from int8_tbl;
ERROR:  utility statements cannot be prepared

Is anybody excited about that?  If it is something we have to keep,
it seems like pretty nearly a deal-breaker for this patch, because
storing a CreateTableAsStmt containing an already-prepared plan would
complicate matters unreasonably.  You can still get approximately
the same result with

prepare foo as select * from int8_tbl;
create table bar as execute foo;

which if anything is more useful since you didn't nail down the target
table name in the PREPARE, but nonetheless it's different.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers, patch v11

2012-03-19 Thread Tom Lane
I wrote:
 One thing I soon found is that it lacks support for EXPLAIN SELECT INTO.

While I'm not particularly excited about fixing PREPARE ... SELECT INTO
or CREATE RULE ... SELECT INTO, I've come to the conclusion that the
EXPLAIN case is a must-fix.  Because not only is EXPLAIN SELECT INTO
broken, but so is EXPLAIN CREATE TABLE AS, and the latter functionality
is actually documented.  So presumably somebody went out of their way
to make this work, at some point.

Since I've got the table-creation code moved into intorel_startup,
this doesn't look to be that painful, but it will require an API change
for ExplainOnePlan, which is slightly annoying because that's probably
in use by third-party plugins.  We could either break them obviously
(by adding an explicit parameter) or break them subtly (by adding an
ExplainState field they might forget to initialize).  The former seems
preferable.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers, patch v11

2012-03-19 Thread Tom Lane
I've applied the CTAS patch after rather heavy editorialization.  Don't
know what consequences that will have for Dimitri's patch.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers, patch v11

2012-03-18 Thread Tom Lane
BTW, I've been looking through how to do what I suggested earlier to get
rid of the coziness and code duplication between CreateTableAs and the
prepare.c code; namely, let CreateTableAs create a DestReceiver and then
call ExecuteQuery with that receiver.  It appears that we still need at
least two bits of added complexity with that approach:

1. ExecuteQuery still has to know that's a CREATE TABLE AS operation so
that it can enforce that the prepared query is a SELECT.  (BTW, maybe
this should be weakened to something that returns tuples, in view of
RETURNING?)

2. Since ExecuteQuery goes through the Portal machinery, there's no way
for it to pass in eflags to control the table OIDs setup.  This is
easily solved by adding an eflags parameter to PortalStart, which
doesn't seem too awful to me, since the typical caller would just pass
zero.  (ExecuteQuery would also have to know about testing
interpretOidsOption to compute the eflags to use, unless we add an
eflags parameter to it, which might be the cleaner option.)

In short I'm thinking: add an eflags parameter to PortalStart, and add
both an eflags parameter and a bool mustReturnTuples parameter to
ExecuteQuery.  This definitely seems cleaner than the current
duplication of code.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers, patch v11

2012-03-18 Thread Dimitri Fontaine
Tom Lane t...@sss.pgh.pa.us writes:
 1. ExecuteQuery still has to know that's a CREATE TABLE AS operation so
 that it can enforce that the prepared query is a SELECT.  (BTW, maybe
 this should be weakened to something that returns tuples, in view of
 RETURNING?)

That lights a bulb: what about rewriting CREATE TABLE AS as two
commands, internally:

 1. CREATE TABLE …
 2. WITH x ( query here ) INSERT INTO … SELECT * FROM x;

It seems like not solving much from a practical point of view though as
you need to be able to parse the output of the subquery before you can
do the first step, but on the other hand it might be that you already
have the pieces to just do that.

Well, I had to share that though, somehow.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers, patch v11

2012-03-18 Thread Peter Eisentraut
On lör, 2012-03-17 at 18:04 -0400, Tom Lane wrote:
 I'm not sure what we should do instead.  We have gotten push-back before
 anytime we changed the command tag for an existing command (and in fact
 it seems that we intentionally added the rowcount display in 9.0, which
 means there are people out there who care about that functionality).
 On the other hand, the traditional output for CREATE TABLE AS doesn't
 seem to satisfy the principle of least astonishment.  A third
 consideration is that if we are pushing CREATE TABLE AS as the preferred
 spelling, people will probably complain if it omits functionality that
 SELECT INTO provides; so I'm not sure that SELECT n in one case and
 CREATE TABLE AS in the other would be a good idea either.  Any
 opinions what to do here? 

Another consideration is that the SQL command tags are defined by the
SQL standard.  So if we were to change it, then it should be CREATE
TABLE.  I'm not convinced that it should be changed, though.  (I recall
cross-checking our list against the SQL standard in the past, so there
might have been discussion on this already.)


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers, patch v11

2012-03-18 Thread Tom Lane
Dimitri Fontaine dimi...@2ndquadrant.fr writes:
 That lights a bulb: what about rewriting CREATE TABLE AS as two
 commands, internally:

Given the compatibility constraints on issues like what command tag
to return, I think that would probably make our jobs harder not easier.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers, patch v11

2012-03-18 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes:
 On lör, 2012-03-17 at 18:04 -0400, Tom Lane wrote:
 I'm not sure what we should do instead.  We have gotten push-back before
 anytime we changed the command tag for an existing command (and in fact
 it seems that we intentionally added the rowcount display in 9.0, which
 means there are people out there who care about that functionality).
 On the other hand, the traditional output for CREATE TABLE AS doesn't
 seem to satisfy the principle of least astonishment.  A third
 consideration is that if we are pushing CREATE TABLE AS as the preferred
 spelling, people will probably complain if it omits functionality that
 SELECT INTO provides; so I'm not sure that SELECT n in one case and
 CREATE TABLE AS in the other would be a good idea either.  Any
 opinions what to do here? 

 Another consideration is that the SQL command tags are defined by the
 SQL standard.  So if we were to change it, then it should be CREATE
 TABLE.  I'm not convinced that it should be changed, though.  (I recall
 cross-checking our list against the SQL standard in the past, so there
 might have been discussion on this already.)

If we were going to change the output at all, I would vote for CREATE
TABLE  so as to preserve the rowcount functionality.  Keep in mind
though that this would force client-side changes, for instance in
libpq's PQcmdTuples().  Fixing that one routine isn't so painful, but
what of other client-side libraries, not to mention applications?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers, patch v11

2012-03-17 Thread Tom Lane
Andres Freund and...@anarazel.de writes:
 On Friday, March 16, 2012 10:52:55 PM Tom Lane wrote:
 Something else I just came across is that there are assorted places that
 are aware that ExplainStmt contains a Query, eg setrefs.c, plancache.c,
 and those have got to treat CreateTableAsStmt similarly.

 Hm. Is that so? As implemented in my version the planner just sees a plain 
 statement instead of a utility statement. Am I missing something?

Well, the work flow for EXPLAIN is:

parse analysis: recursively do parse analysis on contained query
plan: do nothing
execution: call planner on contained query, then optionally run it

and the reason for doing it that way is explained by
transformExplainStmt:

 * EXPLAIN is like other utility statements in that we emit it as a
 * CMD_UTILITY Query node; however, we must first transform the contained
 * query.  We used to postpone that until execution, but it's really necessary
 * to do it during the normal parse analysis phase to ensure that side effects
 * of parser hooks happen at the expected time.

ISTM that argument applies just as much to CREATE TABLE AS, especially
in view of the fact that we're restructuring the SELECT INTO case, in
which parse analysis of the SELECT certainly does happen early.  It's
also not clear to me why it wouldn't apply to COPY (SELECT ...).

I'm not going to touch the COPY (SELECT ...) issue right now, but
somebody ought to go back and check up on the exact user-visible bugs
that motivated moving EXPLAIN's parse analysis processing.  (I suspect
it had to do with plpgsql variable processing, but too lazy to go look
right now.)  If there's a plausible use case where similar bugs could
be exhibited in COPY, we're going to have to restructure that too.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers, patch v11

2012-03-17 Thread Tom Lane
While looking at this I also noticed that DECLARE CURSOR uses a
structure that's randomly different in yet a third way: we start
with a utility statement containing a query, and then flip that
upside down so that the SELECT Query contains a utility statement!

I have a vague feeling that I'm the one who's guilty of that hack, too.

I'm not sure that anybody cares about being able to fire command
triggers on DECLARE CURSOR, but just from a consistency standpoint it
would make sense for this to work more like EXPLAIN and CREATE TABLE AS.
So that convinces me that UtilityContainsQuery() would be a good thing,
and I'll add that to the patch.  (Actually changing DECLARE CURSOR seems
like a separate patch, though.)

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers, patch v11

2012-03-17 Thread Andres Freund
On Saturday, March 17, 2012 06:45:27 PM Tom Lane wrote:
 I'm not sure that anybody cares about being able to fire command
 triggers on DECLARE CURSOR
I actually think it would make sense to explicitly not fire command triggers 
there given that DECLARE CURSOR actually potentially is somewhat performance 
relevant.

 , but just from a consistency standpoint it
 would make sense for this to work more like EXPLAIN and CREATE TABLE AS.
 So that convinces me that UtilityContainsQuery() would be a good thing,
 and I'll add that to the patch.  (Actually changing DECLARE CURSOR seems
 like a separate patch, though.)
Aggreed on that.

Andres

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers, patch v11

2012-03-17 Thread Tom Lane
I've found a couple more issues in the CTAS patch:

1. Previous versions delivered a SELECT n command tag for either
spelling of the command:

regression=# select * into t1 from int8_tbl;
SELECT 6
regression=# create table t2 as select * from int8_tbl;
SELECT 6

With the patch I get

regression=# select * into t1 from int8_tbl;
SELECT 0 0
regression=# create table t2 as select * from int8_tbl;
CREATE TABLE AS

The first of these is particularly unfortunate since it's outright lying
as to the number of rows processed.

I'm not sure what we should do instead.  We have gotten push-back before
anytime we changed the command tag for an existing command (and in fact
it seems that we intentionally added the rowcount display in 9.0, which
means there are people out there who care about that functionality).
On the other hand, the traditional output for CREATE TABLE AS doesn't
seem to satisfy the principle of least astonishment.  A third
consideration is that if we are pushing CREATE TABLE AS as the preferred
spelling, people will probably complain if it omits functionality that
SELECT INTO provides; so I'm not sure that SELECT n in one case and
CREATE TABLE AS in the other would be a good idea either.  Any
opinions what to do here?

2. Historically, CREATE RULE has allowed a rule action to be SELECT INTO
(though not CREATE TABLE AS).  Currently the patch is throwing an error
for that.  This seems like something that might not be worth fixing,
though.  It's fairly hard to conceive of a use-case for such a rule,
since it would work only once before starting to throw table already
exists errors.  How much do we care about preserving backward
compatibility here?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers, patch v11

2012-03-17 Thread Andres Freund
On Saturday, March 17, 2012 11:04:30 PM Tom Lane wrote:
 I've found a couple more issues in the CTAS patch:
 
 1. Previous versions delivered a SELECT n command tag for either
 spelling of the command:
 
 regression=# select * into t1 from int8_tbl;
 SELECT 6
 regression=# create table t2 as select * from int8_tbl;
 SELECT 6
 
 With the patch I get
 
 regression=# select * into t1 from int8_tbl;
 SELECT 0 0
hm. Stupid me.

 regression=# create table t2 as select * from int8_tbl;
 CREATE TABLE AS
 
 The first of these is particularly unfortunate since it's outright lying
 as to the number of rows processed.
 I'm not sure what we should do instead.  We have gotten push-back before
 anytime we changed the command tag for an existing command (and in fact
 it seems that we intentionally added the rowcount display in 9.0, which
 means there are people out there who care about that functionality).
 On the other hand, the traditional output for CREATE TABLE AS doesn't
 seem to satisfy the principle of least astonishment.  A third
 consideration is that if we are pushing CREATE TABLE AS as the preferred
 spelling, people will probably complain if it omits functionality that
 SELECT INTO provides; so I'm not sure that SELECT n in one case and
 CREATE TABLE AS in the other would be a good idea either.  Any
 opinions what to do here?
I would prefer both returning CREATE TABLE AS since thats what actually 
happens. We already document that SELECT INTO is kinda deprecated: The 
PostgreSQL usage of SELECT INTO to represent table creation is historical. It 
is best to use CREATE TABLE AS for this purpose in new code.
I have seen code that uses the command code for selecting the, app level, 
logging. Its kinda hard to do that if a CREATE TABLE AS/SELECT INTO returns 
SELECT.

Does CTAS ommit any functionality currently? I don't see any reason not to 
support stuff there thats supported in SELECT INTO.

 2. Historically, CREATE RULE has allowed a rule action to be SELECT INTO
 (though not CREATE TABLE AS).  Currently the patch is throwing an error
 for that.  This seems like something that might not be worth fixing,
 though.  It's fairly hard to conceive of a use-case for such a rule,
 since it would work only once before starting to throw table already
 exists errors.  How much do we care about preserving backward
 compatibility here?
I vote for not supporting that anymore. That being possible looks more like an 
implementation detail to me.


Thanks for looking at this!

Andres

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers, v16

2012-03-16 Thread Andres Freund
On Thursday, March 15, 2012 11:41:21 PM Thom Brown wrote:
 On 15 March 2012 22:06, Dimitri Fontaine dimi...@2ndquadrant.fr wrote:
  Dimitri Fontaine dimi...@2ndquadrant.fr writes:
  At this moment in time, CTAS is still outstanding.  Is the plan to try
  to get that in for this release, or as an enhancement in 9.3?
  
  The plan is to get CTAS as a utility command in 9.2 then update the
  command trigger patch to benefit from the new situation. We've been
  wondering about making its own commit fest entry for that patch, it's
  now clear in my mind, that needs to happen.
  
   https://commitfest.postgresql.org/action/patch_view?id=823
 
 Looks like the ctas-on-command-triggers-01.patch patch needs re-basing.
I can do that - but imo the other patch (not based on the command triggers 
stuff) is the relevant for now as this patch ought to be applied before 
command triggers. It doesn't seem to make too much sense to rebase it 
frequently as long as the command triggers patch isn't stable...

Any reason you would prefer it being rebased?

Andres

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers, v16

2012-03-16 Thread Christian Ullrich

* Thom Brown wrote:


I don’t understand how functions can return a type of “command
trigger”.  This certainly works, but I’ve never seen a type consisting
of more than one word.  Could you explain this for me?  This is also


postgres= with types (name) as
postgres- (select format_type(oid, NULL) from pg_type)
postgres- select name from types where name like '% %'
postgres- and not name like '%[]';
name
-
 double precision
 character varying
 time without time zone
 timestamp without time zone
 timestamp with time zone
 time with time zone
 bit varying
(7 Zeilen)

I think these are all specially handled in the parser.

--
Christian Ullrich




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers, v16

2012-03-16 Thread Thom Brown
On 16 March 2012 08:13, Andres Freund and...@anarazel.de wrote:
 On Thursday, March 15, 2012 11:41:21 PM Thom Brown wrote:
 Looks like the ctas-on-command-triggers-01.patch patch needs re-basing.
 I can do that - but imo the other patch (not based on the command triggers
 stuff) is the relevant for now as this patch ought to be applied before
 command triggers. It doesn't seem to make too much sense to rebase it
 frequently as long as the command triggers patch isn't stable...

 Any reason you would prefer it being rebased?

Using latest Git master without any additional patches, I can't get it to apply:

Hunk #1 FAILED at 16.
Hunk #2 succeeded at 22 (offset -1 lines).
1 out of 2 hunks FAILED -- saving rejects to file
src/include/commands/tablecmds.h.rej

Thom

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers, v16

2012-03-16 Thread Andres Freund
Hi,

On Thursday, March 15, 2012 10:58:49 PM Dimitri Fontaine wrote:
 I tricked that in the grammar, the type is called cmdtrigger but I
 though it wouldn't be a good choice for the SQL statement.
Hm. I am decidedly unhappy with that grammar hackery... But then maybe I am 
squeamish.

 + oid  |  typname   | oid  |  proname   
 +--++--+
 + 1790 | refcursor  |   46 | textin
 + 3838 | cmdtrigger | 2300 | trigger_in
 +(2 rows)
Hm. Wonder if its a good idea to reuse trigger_in. So far we have duplicated 
functions for that.

 @@ -482,12 +494,21 @@ ListCommandTriggers(CommandContext cmd)
 
 switch (form-ctgtype)
 {
 
 case CMD_TRIGGER_FIRED_BEFORE:
 -   cmd-before = lappend_oid(cmd-before, form-ctgfoid);
 +   {
 +   if (list_any_triggers)
 +   cmd-before_any = lappend_oid(cmd-before_any,
 form-ctgfoid); +   else
 +   cmd-before = lappend_oid(cmd-before, form-ctgfoid);
 
 break;
 
 -
 ...
 +   case CMD_TRIGGER_FIRED_BEFORE:
 +   {
 +   whenstr = BEFORE;
 +
 +   foreach(cell, cmd-before_any)
 +   {
 +   Oid proc = lfirst_oid(cell);
 +
 +   call_cmdtrigger_procedure(cmd, (RegProcedure)proc,
 whenstr); +   }
 +   foreach(cell, cmd-before)
 +   {
 +   Oid proc = lfirst_oid(cell);
 +
 +   call_cmdtrigger_procedure(cmd, (RegProcedure)proc,
 whenstr); +   }
 +   break;
 +   }
This will have the effect of calling triggers outside of alphabetic order. I 
don't think thats a good idea even if one part is ANY and the other a specific 
command.
I don't think there is any reason anymore to separate the two? The only 
callsite seems to look like:

632-default:
633:ListCommandTriggers(cmd, true);   /* list ANY command 
triggers */
634:ListCommandTriggers(cmd, false);  /* and triggers for 
this 
command tag */

Andres

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers, v16

2012-03-16 Thread Andres Freund
On Friday, March 16, 2012 09:30:58 AM Thom Brown wrote:
 On 16 March 2012 08:13, Andres Freund and...@anarazel.de wrote:
  On Thursday, March 15, 2012 11:41:21 PM Thom Brown wrote:
  Looks like the ctas-on-command-triggers-01.patch patch needs re-basing.
  
  I can do that - but imo the other patch (not based on the command
  triggers stuff) is the relevant for now as this patch ought to be
  applied before command triggers. It doesn't seem to make too much sense
  to rebase it frequently as long as the command triggers patch isn't
  stable...
  
  Any reason you would prefer it being rebased?
 
 Using latest Git master without any additional patches, I can't get it to
 apply:
 
 Hunk #1 FAILED at 16.
 Hunk #2 succeeded at 22 (offset -1 lines).
 1 out of 2 hunks FAILED -- saving rejects to file
 src/include/commands/tablecmds.h.rej
Did you read the paragraph above?

Andres

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers, v16

2012-03-16 Thread Thom Brown
On 15 March 2012 21:58, Dimitri Fontaine dimi...@2ndquadrant.fr wrote:
 Thom Brown thombr...@gmail.com writes:
 I don’t understand how functions can return a type of “command
 trigger”.  This certainly works, but I’ve never seen a type consisting
 of more than one word.  Could you explain this for me?  This is also

 I tricked that in the grammar, the type is called cmdtrigger but I
 though it wouldn't be a good choice for the SQL statement.

Christian sent me a message mentioning that we've pretty much always
had data types consisting of more than one word (e.g.  timestamp
without time zone).  So I completely retract my question as it's
obviously nonsense.

Thom

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers, v16

2012-03-16 Thread Thom Brown
On 16 March 2012 08:45, Andres Freund and...@anarazel.de wrote:
 On Friday, March 16, 2012 09:30:58 AM Thom Brown wrote:
 On 16 March 2012 08:13, Andres Freund and...@anarazel.de wrote:
  On Thursday, March 15, 2012 11:41:21 PM Thom Brown wrote:
  Looks like the ctas-on-command-triggers-01.patch patch needs re-basing.
 
  I can do that - but imo the other patch (not based on the command
  triggers stuff) is the relevant for now as this patch ought to be
  applied before command triggers. It doesn't seem to make too much sense
  to rebase it frequently as long as the command triggers patch isn't
  stable...
 
  Any reason you would prefer it being rebased?

 Using latest Git master without any additional patches, I can't get it to
 apply:

 Hunk #1 FAILED at 16.
 Hunk #2 succeeded at 22 (offset -1 lines).
 1 out of 2 hunks FAILED -- saving rejects to file
 src/include/commands/tablecmds.h.rej
 Did you read the paragraph above?

Yes, but I don't think I'm clear on what you mean.  Are you saying I
should use ctas-01.patch instead of ctas-on-command-triggers-01.patch?
 If so, that patch results in me not being able to apply Dimitri's
command triggers patch.  And I thought that patch doesn't actually
cause triggers to fire on CTAS?

Thom

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers, v16

2012-03-16 Thread Andres Freund
On Friday, March 16, 2012 09:55:10 AM Thom Brown wrote:
 On 16 March 2012 08:45, Andres Freund and...@anarazel.de wrote:
  On Friday, March 16, 2012 09:30:58 AM Thom Brown wrote:
  On 16 March 2012 08:13, Andres Freund and...@anarazel.de wrote:
   On Thursday, March 15, 2012 11:41:21 PM Thom Brown wrote:
   Looks like the ctas-on-command-triggers-01.patch patch needs
   re-basing.
   
   I can do that - but imo the other patch (not based on the command
   triggers stuff) is the relevant for now as this patch ought to be
   applied before command triggers. It doesn't seem to make too much
   sense to rebase it frequently as long as the command triggers patch
   isn't stable...
   
   Any reason you would prefer it being rebased?
  
  Using latest Git master without any additional patches, I can't get it
  to apply:
  
  Hunk #1 FAILED at 16.
  Hunk #2 succeeded at 22 (offset -1 lines).
  1 out of 2 hunks FAILED -- saving rejects to file
  src/include/commands/tablecmds.h.rej
  
  Did you read the paragraph above?
 
 Yes, but I don't think I'm clear on what you mean.  Are you saying I
 should use ctas-01.patch instead of ctas-on-command-triggers-01.patch?
  If so, that patch results in me not being able to apply Dimitri's
 command triggers patch.  And I thought that patch doesn't actually
 cause triggers to fire on CTAS?
Well. Why do you want to apply them concurrently? As far as I understand the 
plan is to get ctas-as-utility merged with master and then let dim rebase the 
ddl trigger patch.
The ctas-as-utility stuff imo is worthy of being applied independently of DDL 
triggers. The current duplication in the code lead to multiple bugs already.

Andres

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers, v16

2012-03-16 Thread Dimitri Fontaine
Andres Freund and...@anarazel.de writes:
 On Thursday, March 15, 2012 10:58:49 PM Dimitri Fontaine wrote:
 I tricked that in the grammar, the type is called cmdtrigger but I
 though it wouldn't be a good choice for the SQL statement.
 Hm. I am decidedly unhappy with that grammar hackery... But then maybe I am
 squeamish.

It's easy to remove that hack and get back to having the user visible
type be cmdtrigger, but as this type only serves as a marker for PL
compiling function (validate handler) I though having a user friendly
name was important here.

 + oid  |  typname   | oid  |  proname
 +--++--+
 + 1790 | refcursor  |   46 | textin
 + 3838 | cmdtrigger | 2300 | trigger_in
 +(2 rows)
 Hm. Wonder if its a good idea to reuse trigger_in. So far we have duplicated
 functions for that.

Again, if we think the use case warrants duplicating code rather than
playing with the type definition, I will do that.

 This will have the effect of calling triggers outside of alphabetic order. I
 don't think thats a good idea even if one part is ANY and the other a specific
 command.
 I don't think there is any reason anymore to separate the two? The only
 callsite seems to look like:

The idea is to have a predictable ordering of command triggers. The code
changed in the patch v16 (you pasted code from git in between v15 and
v16, I cleaned it up) and is now easier to read:

case CMD_TRIGGER_FIRED_BEFORE:
whenstr = BEFORE;
procs[0] = cmd-before_any;
procs[1] = cmd-before;
break;

case CMD_TRIGGER_FIRED_AFTER:
whenstr = AFTER;
procs[0] = cmd-after;
procs[1] = cmd-after_any;
break;

So it's BEFORE ANY then BEFORE command then AFTER command then AFTER
ANY. That's an arbitrary I made and we can easily reconsider. Triggers
are called in alphabetical order in each “slot” here.

In my mind it makes sense to have ANY triggers around the specific
triggers, but it's hard to explain why that feels better.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers, v16

2012-03-16 Thread Dimitri Fontaine
Thom Brown thombr...@gmail.com writes:
 Note: incremental patch attached for the following section...

Applied, thanks!

 I don’t know if this was a problem before that I didn’t spot
 (probably), but triggers for both ANY COMMAND and ALTER FOREIGN TABLE
 show a command tag of ALTER TABLE for ALTER FOREIGN TABLE statements
 where the column is renamed:

 I don’t think this is the fault of the trigger code because it
 actually says ALTER TABLE at the bottom, suggesting it’s something
 already present.  This isn’t the case when adding or dropping columns.
  Any comments?

We're building command trigger on top of the existing code. From the
grammar we can see that an alter table and an alter foreign table are
processed as the same command.

AlterForeignTableStmt:
ALTER FOREIGN TABLE relation_expr alter_table_cmds
{
AlterTableStmt *n = makeNode(AlterTableStmt);

So while I think we could want to revisit that choice down the road, I
don't think that's for the command triggers patch to do it.

 Altering the properties of a function (such as cost, security definer,
 whether it’s stable etc) doesn’t report the function’s OID:

That's now fixed. I've checked that all the other places where we're
saying objectId = InvalidOid are related to before create or after drop
commands.

 I get a garbage objectname for AFTER ANY COMMAND triggers on ALTER
 TEXT SEARCH DICTIONARY when changing its options.  It doesn’t show it
 in the below example because I can’t get it displaying in plain text,
 but where the objectname is blank is where I’m seeing unicode a square
 containing “0074” 63 times in a row:

I couldn't reproduce, but I've spotted the problem in the source, and
fixed it. I could find a couple other places that should have been using
pstrdup(NameStr(…)) too, and fixed them. I added a test.

 Specific command triggers on ALTER VIEW don’t work at all:

Can't reproduce, and that's already part of the regression tests.

 Command triggers that fire for CREATE RULE show a schema, but DROP
 RULE doesn’t.  Which is it?:

Oh, both RULE and TRIGGER are not qualified, and I was filling the
schemaname with the schema of the relation they refer to in the CREATE
command, and had DROP correctly handling the TRIGGER case.

That's now fixed, schemaname is NULL in all cases here.

You can read the changes here:

  https://github.com/dimitri/postgres/compare/5e8e37922d...144d870162

And I've been attaching an incremental patch too. Next patch revision is
expected later today with support for plpython, plperl and pltcl.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

diff --git a/doc/src/sgml/plpgsql.sgml b/doc/src/sgml/plpgsql.sgml
index f5f2079..5f84751 100644
--- a/doc/src/sgml/plpgsql.sgml
+++ b/doc/src/sgml/plpgsql.sgml
@@ -3933,20 +3933,19 @@ SELECT * FROM sales_summary_bytime;
titleTriggers on commands/title
 
para
-applicationPL/pgSQL/application can be used to define trigger
-procedures. A trigger procedure is created with the
+applicationPL/pgSQL/application can be used to define command
+trigger procedures. A command trigger procedure is created with the
 commandCREATE FUNCTION/ command, declaring it as a function with
 no arguments and a return type of typecommand trigger/type.
/para
 
   para
When a applicationPL/pgSQL/application function is called as a
-   trigger, several special variables are created automatically in the
-   top-level block. They are:
+   command trigger, several special variables are created automatically
+   in the top-level block. They are:
 
variablelist
 varlistentry
-varlistentry
  termvarnameTG_TAG/varname/term
  listitem
   para
@@ -4002,7 +4001,7 @@ SELECT * FROM sales_summary_bytime;
   /para
 
   para
-   The command trigger function return's value is not used.
+   The command trigger function's return value is not used.
   /para
 
para
@@ -4014,23 +4013,20 @@ SELECT * FROM sales_summary_bytime;
 titleA applicationPL/pgSQL/application Command Trigger Procedure/title
 
 para
- This example trigger just raise a literalNOTICE/literal message
+ This example trigger simply raises a literalNOTICE/literal message
  each time a supported command is executed.
 /para
 
 programlisting
-create or replace function snitch()
- returns command trigger
- language plpgsql
-as $$
-begin
-  raise notice 'snitch: % % %.% [%]',
+CREATE OR REPLACE FUNCTION snitch() RETURNS command trigger AS $$
+BEGIN
+RAISE NOTICE 'snitch: % % %.% [%]',
tg_when, tg_tag, tg_schemaname, tg_objectname, tg_objectid;
-end;
-$$;
+END;
+$$ LANGUAGE plpgsql;
 
-create command trigger snitch_before before any command execute procedure any_snitch();
-create command trigger snitch_after after any command execute procedure any_snitch();
+CREATE COMMAND TRIGGER snitch_before BEFORE ANY COMMAND EXECUTE PROCEDURE snitch();
+CREATE COMMAND 

Re: [HACKERS] Command Triggers, v16

2012-03-16 Thread Thom Brown
On 16 March 2012 11:42, Dimitri Fontaine dimi...@2ndquadrant.fr wrote:
 Thom Brown thombr...@gmail.com writes:
 Specific command triggers on ALTER VIEW don’t work at all:

 Can't reproduce, and that's already part of the regression tests.

This was a problem my side (a mistake I made previously) as I hadn't
added this particular one into my list of created command triggers.  I
had then seen the triggers fire, but forgot to go back and remove my
statement from the draft email.  Apologies.

 And I've been attaching an incremental patch too. Next patch revision is
 expected later today with support for plpython, plperl and pltcl.

Okay, I shalln't do any more testing until the next patch.  I should
probably have worked on automating my tests more, but never got round
to it.

Thom

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers, v16

2012-03-16 Thread Dimitri Fontaine
Thom Brown thombr...@gmail.com writes:
 Okay, I shalln't do any more testing until the next patch.  I should
 probably have worked on automating my tests more, but never got round
 to it.

  make installcheck :)

That said, your test allow to spot OID problems that we can't add in the
regression tests (OID being too volatile would break them), and I have
to look at how to add regression tests for the other pls support…

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers, v16

2012-03-16 Thread Thom Brown
On 16 March 2012 12:07, Dimitri Fontaine dimi...@2ndquadrant.fr wrote:
 Thom Brown thombr...@gmail.com writes:
 Okay, I shalln't do any more testing until the next patch.  I should
 probably have worked on automating my tests more, but never got round
 to it.

  make installcheck :)

 That said, your test allow to spot OID problems that we can't add in the
 regression tests (OID being too volatile would break them), and I have
 to look at how to add regression tests for the other pls support…

Yes, that's why I don't use the regression tests. :)

Thom

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers, v16

2012-03-16 Thread Tom Lane
Andres Freund and...@anarazel.de writes:
 On Thursday, March 15, 2012 10:58:49 PM Dimitri Fontaine wrote:
 I tricked that in the grammar, the type is called cmdtrigger but I
 though it wouldn't be a good choice for the SQL statement.

 Hm. I am decidedly unhappy with that grammar hackery... But then maybe I am 
 squeamish.

Multi-word type names are a serious pain in the ass; they require
hackery in a lot of places.  We support the ones that the SQL spec
requires us to, but I will object in the strongest terms to inventing
any that are not required by spec.  I object in even stronger terms to
the incredibly klugy way you did it here.

If you think cmdtrigger isn't a good name maybe you should have
picked a different one to start with.

While I'm looking at the grammar ... it also seems like a serious
PITA from a maintenance standpoint that we're now going to have to
adjust the CREATE COMMAND TRIGGER productions every time somebody
thinks of a new SQL command.  Maybe we should drop this whole idea
of specifying which commands a trigger acts on at the SQL level,
and just have one-size-fits-all command triggers.  Or perhaps have
the selection be on the basis of strings that are matched to command
tags, instead of grammar constructs.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers, v16

2012-03-16 Thread Andres Freund
On Friday, March 16, 2012 02:50:55 PM Tom Lane wrote:
 While I'm looking at the grammar ... it also seems like a serious
 PITA from a maintenance standpoint that we're now going to have to
 adjust the CREATE COMMAND TRIGGER productions every time somebody
 thinks of a new SQL command. 
I don't find that argument really convincing. The current state of the patch  
will often require attention to command triggers anyway...

Andres

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Command Triggers, v16

2012-03-16 Thread Tom Lane
Andres Freund and...@anarazel.de writes:
 On Friday, March 16, 2012 02:50:55 PM Tom Lane wrote:
 While I'm looking at the grammar ... it also seems like a serious
 PITA from a maintenance standpoint that we're now going to have to
 adjust the CREATE COMMAND TRIGGER productions every time somebody
 thinks of a new SQL command. 

 I don't find that argument really convincing.

Well, how about just plain parser size bloat?  Did anyone look at
how much bigger gram.o becomes with this?  Bigger parser - slower,
for everybody, whether they care about this feature or not.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


  1   2   3   4   >