Re: [HACKERS] logical changeset generation v6.2

Andres Freund Tue, 15 Oct 2013 06:19:00 -0700

On 2013-10-15 08:42:20 -0400, Robert Haas wrote:
> On Mon, Oct 14, 2013 at 9:51 AM, Andres Freund <and...@2ndquadrant.com> wrote:
> >> Well, I just think relying on specific symbol names in the .so file is
> >> kind of unfortunate.  It means that, for example, you can't have
> >> multiple output plugins provided by a single .so.  And in general I
> >> think it's something that we've tried to minimize.
> >
> > But that's not really different when you rely on _PG_init doing it's
> > thing, right?
> 
> Sure, that's true.  But in general I think magic symbol names aren't a
> particularly good design.


It allows you to use the shared libary both as a normal extension loaded
via shared_preload_library or adhoc and as an output plugin which seems
like a sensible goal.
We could have a single _PG_init_output_plugin() symbol that fills in
such a struct which would then not conflict with using the .so
independently. If you prefer that I'll change things around.

We can't do something like 'output_plugin_in_progress' before calling
_PG_init() because _PG_init() won't be called again if the shared object
is already loaded...

> >> But there's only so much information available here.  Why not just
> >> have a format that logs it all?
> >
> > Because we do not know what "all" is? Also, how would we handle
> > replication sets and such that all of the existing replication solutions
> > have generically?
> 
> I don't see how you can fail to know what "all" is.  There's only a
> certain set of facts available.  I mean you could log irrelevant crap
> like a random number that you just picked or the sum of all numeric
> values in the column, but nobody's likely to want that.  What people
> are going to want is the operation performed (insert, update, or
> delete), all the values in the new tuple, the key values from the old
> tuple, the transaction ID, and maybe some meta-information about the
> transaction (such as the commit timestamp).

Some will want all column names included because that makes replication
into different schemas/databases easier, others won't because it makes
replicating the data more complicated and expensive.
Lots will want the primary key as a separate set of columns even for
inserts, others not.
There's also datatypes of values and null representation.

> What I'd probably do is
> emit the data in CSV format, with the first column of each line being
> a single character indicating what sort of row this is: H means a
> header row, defining the format of subsequent rows
> (H,table_name,new_column1,...,new_columnj,old_key_column1,...,old_key_columnk;
> a new header row is emitted only when the column list changes); I, U,
> or D means an insert, update, or delete, with column 2 being the
> transaction ID, column 3 being the table name, and the remaining
> columns matching the last header row for emitted for that table, T
> means meta-information about a transaction, whatever we have (e.g.
> T,txn_id,commit_time).

There's two issues I have with this:
a) CSV seems like a bad format for this. If a transaction inserts into
multiple tables the number of columns will constantly change. Many CSV
parsers don't deal with that all too gracefully. E.g. you can't even
load the data into another postgres database as an audit log.

If we go for CSV I think we should put the entire primary key as one
column (containing all the columns) and the entire row another.

We also don't have any nice facilities for actually writing CSV - so
we'll need to start extracting escaping code from COPY. In the end all
that will make the output plugin very hard to use as an example because
the code will get more complicated.

b) Emitting new row descriptors everytime the schema changes will
require keeping track of the schema. I think that won't be trivial. It
also makes consumption of the data more complicated in comparison to
including the description with every row.

Both are even more true once we extend the format to support streaming
of transactions while they are performed.

> But regular people don't want to write C code; they just want to write
> a crappy Perl script.  And I think we can facilitate that without too
> much work.

I think the generic output plugin should be a separate one from the
example one (which is the one included in the patchset).

> >> Oh, yuck.  So that means you have to write an extra WAL record for
> >> EVERY heap insert, update, or delete to a catalog table?  OUCH.
> >
> > Yes. We could integrate it into the main record without too many
> > problems, but it didn't seem like an important optimization and it would
> > have higher chances of slowing down wal_level < logical.
> 
> Hmm.  I don't know whether that's an important optimization or not.

Based on the benchmark I'd say no. If we discover we need to go there we
can do so later. I don't forsee this to be really problematic.

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] logical changeset generation v6.2

Reply via email to