On Fri, Feb 14, 2014 at 4:55 AM, Andres Freund <and...@2ndquadrant.com> wrote:
> [ patches ]

Having now had a little bit of opportunity to reflect on the State Of
This Patch, I'd like to step back from the minutia upon which I've
been commenting in my previous emails and articulate three high-level
concerns about this patch.  In so doing, I would like to specifically
request that other folks on this mailing list comment on the extent to
which they do or do not believe these concerns to be valid.  I believe
I've mentioned all of these concerns at least to some degree
previously, but they've been mixed in with other things, so I want to
take this opportunity to call them out more clearly.

1. How safe is it to try to do decoding inside of a regular backend?
What we're doing here is entering a special mode where we forbid the
use of regular snapshots in favor of requiring the use of "decoding
snapshots", and forbid access to non-catalog relations.  We then run
through the decoding process; and then exit back into regular mode.
On entering and on exiting this special mode, we
InvalidateSystemCaches().  I don't see a big problem with having
special backends (e.g. walsender) use this special mode, but I'm less
convinced that it's wise to try to set things up so that we can switch
back and forth between decoding mode and regular mode in a single
backend.  I worry that won't end up working out very cleanly, and I
think the prohibition against using this special mode in an
XID-bearing transaction is merely a small downpayment on future pain
in this area.  That having been said, I can't pretend at this point
either to understand the genesis of this particular restriction or
what other problems are likely to crop up in trying to allow this
mode-switching.  So it's possible that I'm overblowing it, but it's
makin' me nervous.

2. I think the snapshot-export code is fundamentally misdesigned.  As
I said before, the idea that we're going to export one single snapshot
at one particular point in time strikes me as extremely short-sighted.
 For example, consider one-to-many replication where clients may join
or depart the replication group at any time.  Whenever somebody joins,
we just want a <snapshot, LSN> pair such that they can apply all
changes after the LSN except for XIDs that would have been visible to
the snapshot.  And in fact, we don't even need any special machinery
for that; the client can just make a connection and *take a snapshot*
once decoding is initialized enough.  This code is going to great
pains to be able to export a snapshot at the precise point when all
transactions that were running in the first xl_running_xacts record
seen after the start of decoding have ended, but there's nothing
magical about that point, except that it's the first point at which a
freshly-taken snapshot is guaranteed to be good enough to establish an
initial state for any table in the database.

But do you really want to keep that snapshot around long enough to
copy the entire database?  I bet you don't: if the database is big,
holding back xmin for long enough to copy the whole thing isn't likely
to be fun.  You might well want to copy one table at a time, with
progressively newer snapshots, and apply to each table only those
transactions that weren't part of the initial snapshot for that table.
 Many other patterns are possible.  What you've got baked in here
right now is suitable only for the simplest imaginable case, and yet
we're paying a substantial price in implementation complexity for it.
Frankly, this code is *ugly*; the fact that SnapBuildExportSnapshot()
needs to start a transaction so that it can push out a snapshot.  I
think that's a pretty awful abuse of the transaction machinery, and
the whole point of it, AFAICS, is to eliminate flexibility that we'd
have with simpler approaches.

3. As this feature is proposed, the only plugin we'll ship with 9.4 is
a test_decoding plugin which, as its own documentation says, "doesn't
do anything especially useful."  What exactly do we gain by forcing
users who want to make use of these new capabilities to write C code?
You previously stated that it wasn't possible (or there wasn't time)
to write something generic, but how hard is it, really?  Sure, people
who are hard-core should have the option to write C code, and I'm
happy that they do.  But that shouldn't, IMHO anyway, be a requirement
to use that feature, and I'm having trouble understanding why we're
making it one.  The test_decoding plugin doesn't seem tremendously
much simpler than something that someone could actually use, so why
not make that the goal?


Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:

Reply via email to