On Fri, Jan 18, 2013 at 9:07 AM, Andres Freund <and...@2ndquadrant.com> wrote:
> On 2013-01-17 22:39:18 -0500, Robert Haas wrote:
>> On Thu, Jan 17, 2013 at 8:33 PM, Andres Freund <and...@2ndquadrant.com> 
>> wrote:
>> > I have no problem requiring C code to use the event data, be it via hooks
>> > or via C functions called from event triggers. The problem I have with
>> > putting in some hooks is that I doubt that you can find sensible spots
>> > with enough information to actually recreate the DDL for a remote system
>> > without doing most of the work for command triggers.
>>
>> It should be noted that the point of KaiGai's work over the last three
>> years has been to solve exactly this problem.  Well, KaiGai wants to
>> check security rather than do replication, but he wants to be able to
>> grovel through the entire node tree and make security decisions based
>> on stuff core PG doesn't care about, so in effect the requirements are
>> identical.  Calling the facility "event triggers" rather than "object
>> access hooks" doesn't make the underlying problem any easier to solve.
>>  I actually believe that the object access hook stuff is getting
>> pretty close to a usable solution if you don't mind coding in C, but
>> I've had trouble convincing anyone else of that.
>>
>> I find it a shame that it hasn't been taken more seriously, because it
>> really does solve the same problem.  sepgsql, for example, has no
>> trouble at all checking permissions for dropped objects.  You can't
>> call procedural code from the spot where we've got that hook, but you
>> sure can call C code, with the usual contract that if it breaks you
>> get to keep both pieces.  The CREATE stuff works fine too.  Support
>> for ALTER is not all there yet, but that's because it's a hard
>> problem.
>
> I don't have a problem reusing the object access infrastructure at all. I just
> don't think it's providing even remotely enough. You have (co-)written that
> stuff, so you probably know more than I do, but could you explain to me how it
> could be reused to replicate a CREATE TABLE?
>
> Problems I see:
> - afaics for CREATE TABLE the only hook is in ATExecAddColumn

No, there's one also in heap_create_with_catalog.  Took me a minute to
find it, as it does not use InvokeObjectAccessHook.  The idea is that
OAT_POST_CREATE fires once per object creation, regardless of the
object type - table, column, whatever.
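Since this keeps coming up whenever consuming these events from C is discussed, here is a rough, self-contained sketch of the hook-chaining idiom that modules in the sepgsql style use.  The typedefs and the hook variable below are simplified stand-ins so the sketch compiles on its own; a real module would instead #include "catalog/objectaccess.h" and do the registration from _PG_init().

```c
#include <stddef.h>

typedef unsigned int Oid;                                     /* stand-in */
typedef enum { OAT_POST_CREATE, OAT_DROP } ObjectAccessType;  /* stand-in */

typedef void (*object_access_hook_type)(ObjectAccessType access,
                                        Oid classId, Oid objectId,
                                        int subId, void *arg);

/* stand-in for the global hook variable the server declares */
static object_access_hook_type object_access_hook = NULL;

/* saved previous hook, so installed hooks chain instead of clobbering */
static object_access_hook_type prev_object_access_hook = NULL;

/* count of creations observed, just to make the effect visible */
static int n_creates = 0;

static void
my_object_access(ObjectAccessType access, Oid classId, Oid objectId,
                 int subId, void *arg)
{
    /* OAT_POST_CREATE fires once per created object, whatever its type */
    if (access == OAT_POST_CREATE)
        n_creates++;

    /* always chain to any hook installed before ours */
    if (prev_object_access_hook)
        prev_object_access_hook(access, classId, objectId, subId, arg);
}

/* what _PG_init() would do in a real module */
static void
module_init(void)
{
    prev_object_access_hook = object_access_hook;
    object_access_hook = my_object_access;
}
```

The chaining through prev_object_access_hook is the important part: several modules can install hooks without stepping on each other, which is the same contract the other C-level hooks in the server follow.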

> - no access to the CreateStmt, making it impossible to decipher whether e.g. a
>   sequence was created as part of this or not

Yep, that's a problem.  We could of course add additional hook sites
with relevant context information - that's what this infrastructure is
supposed to allow for.

> - No way to regenerate the table definition for execution on the remote system
>   without creating libpqdump.

IMHO, that is one of the really ugly problems that we haven't come up
with a good solution for yet.  If you want to replicate DDL, you have
basically three choices:

1. copy over the statement text that was used on the origin server and
hope none of the corner cases bite you
2. come up with some way of reconstituting a DDL statement based on
(a) the parse tree or (b) what the server actually decided to do
3. reconstitute the state of the object from the catalogs after the
command has run

(2a) differs from (2b) for things like CREATE INDEX, where the index
name might be left for the server to determine, but when replicating
you'd like to get the same name out.  (3) is workable for CREATE but
not ALTER or DROP.

The basic problem here is that (1) and (3) are not very
reliable/complete and (2) is a lot of work and introduces a huge code
maintenance burden.  But it's unfair to pin that on the object-access
hook mechanism - any reverse-parsing or catalog-deconstruction
solution for DDL is going to have that problem.  The decision we have
to make as a community is whether we're prepared to support and
maintain that code for the indefinite future.  Although I think it's
easy to say "yes, because DDL replication would be really cool" - and
I sure agree with that - I think it needs to be thought through a bit
more deeply than that.

I have been involved in PostgreSQL development for about 4.5 years
now.  This is less time than many people here, but it's still long
enough to have heard a whole lot of people ask for some variant of
this idea; yet I have never seen anybody produce a complete, working version
of this functionality and maintain it outside of the PostgreSQL tree
for one release cycle (did I miss something?).  If we pull that
functionality into core, that's just a way of taking work that
nobody's been willing to do and forcing the responsibility to be spread
across the whole development group.  Now, sometimes it is OK to do
that, but sometimes it means that we're just bolting more things onto
the already-long list of reasons for which patches can be rejected.
Anybody who is now feeling uncomfortable at the prospect of not having
that facility in core ought to think about how they'll feel about,
say, one additional patch per year not getting committed in every
future release because of this requirement.  Does that feel OK?  What
if the number is two, or three?  What's the tipping point where you'd
say the cost is too high?  (Note: If anyone reading this is tempted to
answer that there is no such tipping point, then you have got a bad
case of patch myopia.)

There's a totally legitimate debate to be had here, but my feeling
about what you're calling libpqdump is:

- It's completely separate from event triggers.
- It's completely separate from object access hooks.
- The fact that no one to my knowledge has maintained such a thing
outside of core, and that it seems like a hard project with lots of
required maintenance, makes me very wary about pushing it into core.

[ BTW: Sorry our IM session got cut off.  It started erroring out
every time I messaged you.  But no problem, anyway. ]

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers