Re: [HACKERS] Logical Replication WIP

Peter Eisentraut Thu, 08 Sep 2016 21:34:50 -0700

Review of 0003-Define-logical-replication-protocol-and-output-plugi.patch:

(This is still based on the Aug 31 patch set, but at quick glance I
didn't see any significant changes in the Sep 8 set.)


Generally, this all seems mostly fine.  Everything is encapsulated
well enough that problems are localized and any tweaks don't affect
the overall work.

Changes needed to build:

--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -2158,8 +2158,8 @@ <title>Logical Streaming Replication
Parameters</title>
      <listitem>
             <para>
             Comma separated list of publication names for which to
subscribe
-       (receive changes). See
-       <xref linkend="logical-replication-publication"> for more info.
+       (receive changes). <!-- See
+       <xref linkend="logical-replication-publication"> for more info. -->
            </para>
          </listitem>
       </varlistentry>
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -25,6 +25,7 @@
 #include "utils/builtins.h"
 #include "utils/inval.h"
 #include "utils/memutils.h"
+#include "utils/syscache.h"

 PG_MODULE_MAGIC;

This is all fixed in later patches.

AFAICT, pgoutput does not use libpq, so the mentions in
src/backend/replication/pgoutput/Makefile are not needed (perhaps
copied from libpqwalreceiver?).

The start_replication option pg_version option is not documented and
not used in any later patch.  We can probably do without it and just
rely on the protocol version.

In pgoutput_startup(), you check opt->output_type.  But it is not set
anywhere.  Actually, the startup callback is supposed to set it
itself.

In init_rel_sync_cache(), the way hash_flags is set seems kind of
weird.  I think that variable could be removed and the flags put
directly into the hash_create() call.

pgoutput_config.c seems over-engineered, e.g., converting cstring to
Datum and back.  Just do normal DefElem list parsing in pgoutput.c.
That's not pretty either, but at least it's a common coding pattern.

In the protocol documentation, explain the meaning of int64 as a
commit timestamp.

Also, the documentation should emphasize more clearly that all the
messages are not actually top-level protocol messages but are
contained inside binary copy data.

On the actual protocol messages:

Why do strings have a length byte?  That is not how other strings in
the protocol work.  As a minor side-effect, this would limit for
example column names to 255 characters.

The message structure doesn't match the documentation in some ways.
For example Attributes and TupleData are not separate messages but are
contained in Relation and Insert/Update/Delete messages.  So the
documentation needs to be structured a bit differently.

In the Attributes message (or actually Relation message), we don't
need the 'A' and 'C' bytes.

I'm not sure that pgoutput should concern itself with the client
encoding.  The client encoding should already be set by the initial
FE/BE protocol handshake.  I haven't checked that further yet, so it
might already work, or it should be made to work that way, or I might
be way off.

Slight abuse of pqformat functions.  We're not composing messages
using pq_beginmessage()/pq_endmessage(), and we're not using
pq_getmsgend() when reading.  The "proper" way to do this is probably
to define a custom set of PQcommMethods.  (low priority)

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Logical Replication WIP

Reply via email to