Re: [HACKERS] Re: pglogical_output - a general purpose logical decoding output plugin

2016-01-24 Thread Tomasz Rybak
I'm merging all your emails for the sake of easier discussion.
I also cut all fragments that do not require a response.

On 22.01.2016 at 11:06 +0800, Craig Ringer wrote:
> > We might also think about changing name of plugin to something
> > resembling "logical_streaming_decoder" or even "logical_streamer"
> > 
> I'm open to ideas there but I'd want some degree of consensus before
> undertaking the changes required. 

I know that it'd require many changes (both in this and the pglogical
plugin), and thus don't want to press for a name change.
On one hand, renaming might help avoid tight mental coupling between
pglogical_output and pglogical. At the same time it's a lot of work, and
I cannot think of a short and nice name, so pglogical_output can stay IMO.

 
> >  + subset of that database may be selected for replication,
> > currently based on
> > + table and on replication origin. Filtering by a WHERE clause can
> > be supported
> > + easily in future.
> > 
> > Is this filtering by table and replication origin implemented? I
> > haven't
> > noticed it in source.
> That's what the hooks are for.
> 

Current documentation suggests that replicating only a selected subset
is already available:
+ A subset of that database may be selected for replication, currently
+ based on table and on replication origin.

"currently based on table and on replication origin" means to me that
current state of plugin allows for just chosing which tables
to replicate. I'd see something like:

"A subset of that database might be selected for replication, e.g.
only chosen tables or changes from particular origin, in custom hook"

to convey that the user needs to provide a hook for filtering.

>  
> > + [the `CREATE_REPLICATION_SLOT ... LOGICAL ...` or
> > `START_REPLICATION SLOT ... LOGICAL ...` commands](http://www.postg
> > resql.org/docs/current/static/logicaldecoding-walsender.html) to
> > start streaming changes. (It can also be used via
> > + [SQL level functions](http://www.postgresql.org/docs/current/stat
> > ic/logicaldecoding-sql.html)
> > + over a non-replication connection, but this is mainly for
> > debugging purposes)
> > 
> > Replication slot can also be configured (causing output plugin to
> > be loaded) via [SQL level functions]...
> Covered in the next section. Or at least it is in the SGML docs
> conversion I'm still trying to finish off..

OK, then I'll wait for the final version to review that.

>  
> > + * Client issues `CREATE_REPLICATION_SLOT slotname LOGICAL
> > 'pglogical'` if it's setting up for the first time
> > 
> > * Client issues `CREATE_REPLICATION_SLOT slotname LOGICAL
> > 'pglogical'` to setup replication if it's connecting for the first
> > time
> I disagree. It's entirely possible to do your slot creation/setup
> manually or via something else, re-use a slot first created by
> another node, etc. Slot creation is part of client setup, not so much
> connection.

I'd propose then:
"* Client issues `CREATE_REPLICATION_SLOT ...` if replication was not
configured earlier, e.g. during a previous connection, or manually
via [SQL functions | link to documentation]"


> > + If your application creates its own slots on first use and hasn't
> > previously
> > + connected to this database on this system you'll need to create a
> > replication
> > + slot. This keeps track of the client's replay state even while
> > it's disconnected.
> > 
> > If your application hasn't previously connected to this database on
> > this system
> > it'll need to create and configure replication slot which keeps
> > track of the
> > client's replay state even while it's disconnected.
> As above, I don't quite agree.
>  

"If your application hasn't previously connedted to this database on
this system, and the replication slot was not configured through other
means (e.g. manually using [SQL functions | URL ] then you'll need
to create and configure replication slot ..."



> > 
> >  DESIGN.md:
> > 
> > + attnos don't necessarily correspond. The column names might, and
> > their ordering
> > + might even be the same, but any column drop or column type change
> > will result
> > 
> > The column names and their ordering might even be the same...
> I disagree, that has a different meaning. It's also not really user-
> facing docs so I'm not too worried about being quite as readable.
> 

I'm not trying to change the meaning, just to fix the grammar. I'd like
either more commas or fewer here. So either:

+ The column names (and their ordering) might ...

or:

+ The column names, and their ordering, might ...



> > Is it true (no way to access change data)? You added passing change
> > to C hooks; from looking at code it looks like it's true, but I
> > want to be sure.
> While the change data is now passed to the C hook, there's no attempt
> to expose it via PL/PgSQL. So yeah, that's still true.
> 

Thanks for confirming.

> > Speaking about flags - in most cases they are 0; only for attributes
> > we might have:
> > 

Re: [HACKERS] Re: pglogical_output - a general purpose logical decoding output plugin

2016-01-24 Thread Andres Freund
On 2016-01-18 21:47:27 +, Tomasz Rybak wrote:
> We might also think about changing name of plugin to something resembling 
> "logical_streaming_decoder" or even "logical_streamer"

FWIW, I find those proposals unconvincing. Not that pglogical_output is
grand, but "streaming decoder" or "logical_streamer" aren't even
correct. An output plugin isn't a "decoder" or a "streamer".




Re: [HACKERS] Re: pglogical_output - a general purpose logical decoding output plugin

2016-01-21 Thread Robert Haas
On Wed, Jan 20, 2016 at 8:04 PM, Craig Ringer  wrote:
> itself is an abbreviation of its self.

I do not think this is true.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Re: pglogical_output - a general purpose logical decoding output plugin

2016-01-21 Thread Robert Haas
On Wed, Jan 20, 2016 at 12:54 AM, Craig Ringer  wrote:
> The idea here is that we want downwards compatibility as far as possible and
> maintainable but we can't really be upwards compatible for breaking protocol
> revisions. So the output plugin's native protocol version is inherently the
> max protocol version and we don't need a separate MAX.

That idea seems right to me.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Re: pglogical_output - a general purpose logical decoding output plugin

2016-01-21 Thread Craig Ringer
On 22 January 2016 at 06:13, Tomasz Rybak  wrote:


> + data stream. The output stream is designed to be compact and fast to
> decode,
> + and the plugin supports upstream filtering of data so that only the
> required
> + information is sent.
>
> plugin supports upstream filtering of data through hooks so that ...
>
>
Ok.


> + subset of that database may be selected for replication, currently based
> on
> + table and on replication origin. Filtering by a WHERE clause can be
> supported
> + easily in future.
>
> Is this filtering by table and replication origin implemented? I haven't
> noticed it in source.
>

That's what the hooks are for.


> + other daemon is required. It's accumulated using
>
> Stream of changes is accumulated...
>

Ok.


> + [the `CREATE_REPLICATION_SLOT ... LOGICAL ...` or `START_REPLICATION
> SLOT ... LOGICAL ...` commands](
> http://www.postgresql.org/docs/current/static/logicaldecoding-walsender.html)
> to start streaming changes. (It can also be used via
> + [SQL level functions](
> http://www.postgresql.org/docs/current/static/logicaldecoding-sql.html)
> + over a non-replication connection, but this is mainly for debugging
> purposes)
>
> Replication slot can also be configured (causing output plugin to be
> loaded) via [SQL level functions]...
>

Covered in the next section. Or at least it is in the SGML docs conversion
I'm still trying to finish off..


> + * Client issues `CREATE_REPLICATION_SLOT slotname LOGICAL 'pglogical'`
> if it's setting up for the first time
>
> * Client issues `CREATE_REPLICATION_SLOT slotname LOGICAL 'pglogical'` to
> setup replication if it's connecting for the first time
>

I disagree. It's entirely possible to do your slot creation/setup manually
or via something else, re-use a slot first created by another node, etc.
Slot creation is part of client setup, not so much connection.


> + Details are in the replication protocol docs.
>
> Add link to file with protocol documentation.
>

Is done in SGML conversion.


> + If your application creates its own slots on first use and hasn't
> previously
> + connected to this database on this system you'll need to create a
> replication
> + slot. This keeps track of the client's replay state even while it's
> disconnected.
>
> If your application hasn't previously connected to this database on this
> system
> it'll need to create and configure replication slot which keeps track of
> the
> client's replay state even while it's disconnected.
>

As above, I don't quite agree.


> + `pglogical`'s output plugin now sends a continuous series of `CopyData`
>
> As this is separate plugin, use 'pglogical_output' plugin now sends...
> (not only here but also in few other places).
>

Thanks. Done.

+ All hooks are called in their own memory context, which lasts for the
> duration
>
> All hooks are called in separate hook memory context, which lasts for the
> duration...
>

I don't see the difference, but OK.

Simplified:

> All hooks are called in a memory context that lasts ...


> + + switched to a longer lived memory context like TopMemoryContext.
> Memory allocated
> + + in the hook context will be automatically when the decoding session
> shuts down.
>
> ...will be automatically freed when the decoding...
>

Fixed, thanks.
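
The pattern being documented here is the usual PostgreSQL one; as a minimal
sketch, assuming a hook that wants to keep some state for the whole decoding
session (MyHookState is a hypothetical type, not part of the plugin):

    MemoryContext oldctx;
    MyHookState  *my_state;

    /* allocate in a longer-lived context so the state survives the hook call */
    oldctx = MemoryContextSwitchTo(TopMemoryContext);
    my_state = palloc0(sizeof(MyHookState));
    MemoryContextSwitchTo(oldctx);

    /*
     * Anything palloc'd without switching stays in the hook context and is
     * freed automatically when the decoding session shuts down.
     */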


> DDL for global object changes must be synchronized via some external means.
>
> Just:
> Global object changes must be synchronized via some external means.
>

Agree, done.


> + determine why an error occurs in a downstream, since you can examine a
> + json-ified representation of the xact. It's necessary to supply a minimal
>
> since you can examine a transaction in json (and not binary) format. It's
> necessary
>

Ok, done.


> + Once you've peeked the stream and know the LSN you want to discard up
> to, you
> + can use `pg_logical_slot_peek_changes`, specifying an `upto_lsn`, to
> consume
>
> Shouldn't it be pg_logical_slot_get_changes? get_changes consumes changes,
> peek_changes leaves them in the stream. Especially as example below
> points that we need to use get_changes.


Yes. It should. Whoops. Thanks.


> DESIGN.md:
>
> + attnos don't necessarily correspond. The column names might, and their
> ordering
> + might even be the same, but any column drop or column type change will
> result
>
> The column names and their ordering might even be the same...
>

I disagree, that has a different meaning. It's also not really user-facing
docs so I'm not too worried about being quite as readable.

README.pglogical_output_plhooks:
>
> + Note that pglogical
> + must already be installed so that its headers can be found.
>
> Note that pglogical_output must already...
>

Thanks.


> + Arguments are the oid of the affected relation, and the change type:
> 'I'nsert,
> + 'U'pdate or 'D'elete. There is no way to access the change data -
> columns changed,
> + new values, etc.
>
> Is it true (no way to access change data)? You added passing change
> to C hooks; from looking at 

Re: [HACKERS] Re: pglogical_output - a general purpose logical decoding output plugin

2016-01-20 Thread Craig Ringer
On 21 January 2016 at 06:23, Tomasz Rybak  wrote:


>
> I reviewed more files:


Thanks.

Can you try to put more whitespace between items? It can be hard to follow
at the moment.


> pglogical_proto_native.c
>
> +   pq_sendbyte(out, 'N');  /* column name block
> follows */
> +   attname = NameStr(att->attname);
> +   len = strlen(attname) + 1;
> +   pq_sendint(out, len, 2);
> +   pq_sendbytes(out, attname, len); /* data */
> Identifier names are limited to 63 bytes - so why are we sending 2 bytes here?
>

Good question. It should be one byte. I'll need to amend the protocol for
that, but I don't think that's a problem this early on.
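
For illustration, a sketch of what the amended send path could look like,
mirroring the snippet above (the one-byte length is not part of the current
protocol yet):

    pq_sendbyte(out, 'N');                  /* column name block follows */
    attname = NameStr(att->attname);
    len = strlen(attname) + 1;              /* includes the terminating NUL */
    Assert(len <= NAMEDATALEN);             /* identifiers are at most 63 bytes + NUL */
    pq_sendbyte(out, (unsigned char) len);  /* one length byte is enough */
    pq_sendbytes(out, attname, len);        /* data */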


> +   pq_sendbyte(out, 'B');  /* BEGIN */
> +
> +   /* send the flags field its self */
> +   pq_sendbyte(out, flags);
> Comment: "flags field its self"? Shouldn't it be "... itself"?
> Similarly in write_origin; write_insert just says:
>

itself is an abbreviation of its self.


> +   /* send the flags field */
> +   pq_sendbyte(out, flags);
>
> Speaking about flags - in most cases they are 0; only for attributes
> we might have:

> +   flags |= IS_REPLICA_IDENTITY;
> I assume that flags field is put into protocol for future needs?
>

Correct. The protocol specifies (in most places; need to double check all
sites) that the lower 4 bits are reserved and must be treated as an ERROR
if set. The high 4 bits must be ignored if set and not recognised. That
gives us some room to wiggle without bumping the protocol version
incompatibly, and lets us use capabilities (via client-supplied parameters)
to add extra information on the wire.
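
As a rough illustration of that rule on the reading side (the function name
is hypothetical, not something in the plugin):

    static void
    check_flags(uint8 flags)
    {
        /* lower 4 bits are reserved: must be treated as an error if set */
        if (flags & 0x0F)
            elog(ERROR, "reserved protocol flag bits set: 0x%02X",
                 (unsigned) (flags & 0x0F));
        /* upper 4 bits: ignore anything we don't recognise */
    }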

> +   /* FIXME support whole tuple (O tuple type) */
> +   if (oldtuple != NULL)
> +   {
> +   pq_sendbyte(out, 'K');  /* old key follows */
> +   pglogical_write_tuple(out, data, rel, oldtuple);
> +   }
> I don't fully understand. We are sending whole old tuple here,
> so this FIXME should be more about supporting sending just keys.
> But then comment "old key follows" is not true. Or am I missing
> something here?
>

In wal_level=logical the tuple that's written is an abbreviated tuple
containing data only for the REPLICA IDENTITY fields.

Ideally we'd also be able to support sending the _whole_ old tuple, but
this would require the ability to log the whole old tuple in WAL when
logging a DELETE or UPDATE into WAL. This isn't so much a FIXME as a
logical decoding limitation and wishlist item; I'll amend to that effect.
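
For what it's worth, the amended comment might read roughly like this (a
sketch only; an 'O' whole-old-tuple message remains a wishlist item):

    /*
     * Wishlist, not a FIXME: sending the whole old tuple ('O') would require
     * logical decoding to log it in WAL. With wal_level=logical we only get
     * the REPLICA IDENTITY columns, which are sent here as the old key.
     */
    if (oldtuple != NULL)
    {
        pq_sendbyte(out, 'K');      /* old key follows */
        pglogical_write_tuple(out, data, rel, oldtuple);
    }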




>
> +   pq_sendbyte(out, 'S');  /* message type field */
> +   pq_sendbyte(out, 1);/* startup message version */
> For now protocol is 1, but for code readability it might be better
> to change this line to:
> +   pq_sendbyte(out, PGLOGICAL_PROTO_VERSION_NUM);  /* startup message version */
>

The startup message format isn't the same as the protocol version.
Hopefully we'll never have to change it. The reason it's specified is so
that if we ever do bump it a decoding plugin can recognise an old client
and fall back. Maybe it's BC overkill but I'd kick myself for not doing it
if we ever decided to (say) add support for structured json startup options
from the client. Working on BDR has taught me that there's no such thing as
too much consideration for cross-version compat and negotiation in
replication.

I'm happy to create a new define for that and a comment to this effect.
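
For instance (the macro name below is an assumption, not taken from the
patch):

    /* Format of the startup message; distinct from the protocol version. */
    #define PGLOGICAL_STARTUP_MSG_FORMAT    1

    pq_sendbyte(out, 'S');                          /* message type field */
    pq_sendbyte(out, PGLOGICAL_STARTUP_MSG_FORMAT); /* startup message format */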


> Just for the sake of avoiding code repetition:
> +   for (i = 0; i < desc->natts; i++)
> +   {
> +   if (desc->attrs[i]->attisdropped)
> +   continue;
> +   nliveatts++;
> +   }
> +   pq_sendint(out, nliveatts, 2);



> The exact same code is in write_tuple and write_attrs. I don't know what the
> policy for refactoring is, but this might be extracted into a separate function.
>

Seems trivial enough not to care, but probably can.
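
If it does get extracted, the helper would be about this big (the name is
hypothetical):

    /* Count attributes that haven't been dropped; shared by write_attrs and write_tuple. */
    static int
    count_live_attrs(TupleDesc desc)
    {
        int     i;
        int     nliveatts = 0;

        for (i = 0; i < desc->natts; i++)
        {
            if (desc->attrs[i]->attisdropped)
                continue;
            nliveatts++;
        }
        return nliveatts;
    }

after which both call sites reduce to pq_sendint(out, count_live_attrs(desc), 2);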


>
> +   else if (att->attlen == -1)
> +   {
> +   char *data = DatumGetPointer(values[i]);
> +
> +   /* send indirect datums inline */
> +   if (VARATT_IS_EXTERNAL_INDIRECT(values[i]))
> +   {
> +   struct varatt_indirect redirect;
> +   VARATT_EXTERNAL_GET_POINTER(redirect, data);
> +   data = (char *) redirect.pointer;
> +   }



> I really don't like this. We have a function parameter "data" and now
> are creating a new variable with the same name.


I agree. Good catch.

I don't much like the use of 'data' as the param name for the plugin
private data and am quite inclined to change that instead, to
plugin_private or something.
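
A sketch of the shadow-free variant, renaming only the inner variable (the
name datptr is arbitrary):

    else if (att->attlen == -1)
    {
        char   *datptr = DatumGetPointer(values[i]);    /* no longer shadows 'data' */

        /* send indirect datums inline */
        if (VARATT_IS_EXTERNAL_INDIRECT(values[i]))
        {
            struct varatt_indirect redirect;

            VARATT_EXTERNAL_GET_POINTER(redirect, datptr);
            datptr = (char *) redirect.pointer;
        }
        /* ... remainder of the varlena handling unchanged ... */
    }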

> pglogical_proto_json.c
>
> +   

Re: [HACKERS] Re: pglogical_output - a general purpose logical decoding output plugin

2016-01-20 Thread Tomasz Rybak
On 20.01.2016 at 13:54 +0800, Craig Ringer wrote:
> On 20 January 2016 at 06:23, Tomasz Rybak 
> wrote:
> > The following review has been posted through the commitfest
> > application:
> > 
> Thanks!
>  
> >  
> > + /* Protocol capabilities */
> > + #define PGLOGICAL_PROTO_VERSION_NUM 1
> > + #define PGLOGICAL_PROTO_MIN_VERSION_NUM 1
> > Is this protocol version number and minimal recognized version
> > number,
> > or major and minor version number? I assume that
> > PGLOGICAL_PROTO_VERSION_NUM
> > is current protocol version (as in config max_proto_version and
> > min_proto_version). But why we have MIN_VERSION and not
> > MAX_VERSION?
> > 
> > From code in pglogical_output.c (lines 215-225) it looks like
> > PGLOGICAL_PROTO_VERSION_NUM is maximum, and
> > PGLOGICAL_PROTO_MIN_VERSION_NUM
> > is treated as minimal protocol version number.
> > 
> > I can see that those constants are exported in
> > pglogical_infofuncs.c and
> > pglogical.sql, but I do not understand omission of MAX.
> Thanks for stopping to think about this. It's one of the areas I
> really want to get right but I'm not totally confident of.
> 
> The idea here is that we want downwards compatibility as far as
> possible and maintainable but we can't really be upwards compatible
> for breaking protocol revisions. So the output plugin's native
> protocol version is inherently the max protocol version and we don't
> need a separate MAX.
> 
> The downstream connects and declares to the upstream "I speak
> protocol 2 through 3". The upstream sees this and replies "I speak
> protocol 1 through 2. We have protocol 2 in common so I will use
> that." Or alternately replies with an error "I only speak protocol 1
> so we have no protocol in common". This is done via the initial
> parameters passed by the downstream to the logical decoding plugin
> and then via the startup reply message that's the first message on
> the logical decoding stream.
> 
> We can't do a better handshake than this because the underlying
> walsender protocol and output plugin API only gives us one chance to
> send free-form information to the output plugin and it has to do so
> before the output plugin can send anything to the downstream.
> 
> As much as possible I want to avoid ever needing to do a protocol
> bump anyway, since it'll involve adding conditionals and duplication.
> That's why the protocol tries so hard to be extensible and rely on
> declared capabilities rather than protocol version bumps. But I'd
> rather plan for it than be unable to ever do it in future.
> 
> Much (all?) of this is discussed in the protocol docs. I should
> probably double check that and add a comment that refers to them
> there.
> 

Thanks for the explanation. I'll think about it more and try to propose
something for this.

Best regards.

-- 
Tomasz Rybak  GPG/PGP key ID: 2AD5 9860
Fingerprint A481 824E 7DD3 9C0E C40A  488E C654 FB33 2AD5 9860
http://member.acm.org/~tomaszrybak






Re: [HACKERS] Re: pglogical_output - a general purpose logical decoding output plugin

2016-01-19 Thread Craig Ringer
On 20 January 2016 at 06:23, Tomasz Rybak  wrote:

> The following review has been posted through the commitfest application:
>

Thanks!


>
> + /* Protocol capabilities */
> + #define PGLOGICAL_PROTO_VERSION_NUM 1
> + #define PGLOGICAL_PROTO_MIN_VERSION_NUM 1
> Is this protocol version number and minimal recognized version number,
> or major and minor version number? I assume that
> PGLOGICAL_PROTO_VERSION_NUM
> is current protocol version (as in config max_proto_version and
> min_proto_version). But why we have MIN_VERSION and not MAX_VERSION?
>
> From code in pglogical_output.c (lines 215-225) it looks like
> PGLOGICAL_PROTO_VERSION_NUM is maximum, and PGLOGICAL_PROTO_MIN_VERSION_NUM
> is treated as minimal protocol version number.
>
> I can see that those constants are exported in pglogical_infofuncs.c and
> pglogical.sql, but I do not understand omission of MAX.
>

Thanks for stopping to think about this. It's one of the areas I really
want to get right but I'm not totally confident of.

The idea here is that we want downwards compatibility as far as possible
and maintainable but we can't really be upwards compatible for breaking
protocol revisions. So the output plugin's native protocol version is
inherently the max protocol version and we don't need a separate MAX.

The downstream connects and declares to the upstream "I speak protocol 2
through 3". The upstream sees this and replies "I speak protocol 1 through
2. We have protocol 2 in common so I will use that." Or alternately replies
with an error "I only speak protocol 1 so we have no protocol in common".
This is done via the initial parameters passed by the downstream to the
logical decoding plugin and then via the startup reply message that's the
first message on the logical decoding stream.

We can't do a better handshake than this because the underlying walsender
protocol and output plugin API only gives us one chance to send free-form
information to the output plugin and it has to do so before the output
plugin can send anything to the downstream.

As much as possible I want to avoid ever needing to do a protocol bump
anyway, since it'll involve adding conditionals and duplication. That's why
the protocol tries so hard to be extensible and rely on declared
capabilities rather than protocol version bumps. But I'd rather plan for it
than be unable to ever do it in future.
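
Spelled out as code, the negotiation rule amounts to something like this
(the function is a sketch; only the two version macros come from the patch):

    static int
    negotiate_proto_version(int client_min, int client_max)
    {
        /* our native protocol version is inherently our maximum */
        int     our_min = PGLOGICAL_PROTO_MIN_VERSION_NUM;
        int     our_max = PGLOGICAL_PROTO_VERSION_NUM;
        int     chosen = Min(our_max, client_max);

        if (chosen < our_min || chosen < client_min)
            elog(ERROR, "no protocol version in common: client speaks %d-%d, we speak %d-%d",
                 client_min, client_max, our_min, our_max);

        return chosen;      /* highest version both sides understand */
    }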

Much (all?) of this is discussed in the protocol docs. I should probably
double check that and add a comment that refers to them there.



> + typedef struct HookFuncName
>

Thanks. That's residue from the prior implementation of hooks, which used
direct pg_proc lookups and cached the FmgrInfo in a dynahash. It's no
longer required now that we're using a single hook entry point and direct C
function calls. Dead code, removed.


> + typedef struct PGLogicalTupleData
> I haven't found these used anywhere, and they are not mentioned in the
> documentation. Are those structures needed?
>

That snuck in from the pglogical downstream during the split into a
separate tree. It's dead code as far as pglogical_output is concerned.
Removed.


> + pglogical_config.c:
> +   switch(get_param_key(elem->defname))
> +   {
> +   val = get_param_value(elem, false,
> OUTPUT_PARAM_TYPE_UINT32);

> Why do we need this line here? All cases contain some variant of
> val = get_param_value(elem, false, TYPE);
> as the first line after "case PARAM_*:", so this line should be removed.
>

Correct. That seems to be an escapee editing error. Thanks, removed.


> +   val = get_param(options, "startup_params_format", false, false,
> +   OUTPUT_PARAM_TYPE_UINT32, &found);
> +
> +   params_format = DatumGetUInt32(val);
> Shouldn't we check "found" here? We work with val (which is Datum(0)) -
> won't it segfault, or similar?
>

get_param is called with missing_ok=false so it ERRORs and can never return
!found . In any case it'd return (Datum)0 so we'd just get (uint32)0 not a
crash.

To make this clearer I've changed get_param so it supports NULL as a value
for found.
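
Roughly, inside get_param (a sketch; the local names are guesses, not the
actual patch):

    /* report presence only if the caller passed somewhere to put it */
    if (found != NULL)
        *found = param_found;

    if (!param_found && !missing_ok)
        ereport(ERROR,
                (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
                 errmsg("missing required parameter \"%s\"", name)));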


> Additionally - I haven't found any case where code would use "found"
> passed from get_param() - since it's not used, it might be removed.
>

Probably, but I expect it to be useful later. It was used before a
restructure of how params get read. I don't mind removing it if you feel
strongly about it, but it'll probably just land up being added back at some
point.


>
>
> +   elog(DEBUG1, "Binary mode rejected: Server and client endian sizeof(long) mismatch");
> Isn't "endian" here a case of copy-paste from the first error?
>

Yes, it is. Thanks.


> + static void pg_decode_shutdown(LogicalDecodingContext * ctx)
> In pg_decode_startup we create the main memory context and the hooks memory
> context. In pg_decode_shutdown we delete the hooks memory context but not the
> main context. Is this OK, or should we also add:
> 

Re: [HACKERS] Re: pglogical_output - a general purpose logical decoding output plugin

2016-01-04 Thread Shulgin, Oleksandr
On Sun, Jan 3, 2016 at 7:21 PM, Tomasz Rybak  wrote:

> The following review has been posted through the commitfest application:
> make installcheck-world:  tested, passed
> Implements feature:   not tested
> Spec compliant:   not tested
> Documentation:        tested, failed
>
> Applies cleanly on current master
> (b416c0bb622ce5d33fdbec3bbce00451132f10ec).
>
> Builds without any problems on current Debian unstable (amd64 arch, GCC
> 5.3.1-4, glibc 2.21-6)
>

A make from an external build dir fails on install, suggested fix:

install: all
        $(MKDIR_P) '$(DESTDIR)$(includedir)'/pglogical_output
-       $(INSTALL_DATA) pglogical_output/hooks.h '$(DESTDIR)$(includedir)'/pglogical_output
+       $(INSTALL_DATA) $(top_srcdir)/contrib/pglogical_output/pglogical_output/hooks.h '$(DESTDIR)$(includedir)'/pglogical_output

--
Alex


Re: [HACKERS] Re: pglogical_output - a general purpose logical decoding output plugin

2016-01-04 Thread Alvaro Herrera
Shulgin, Oleksandr wrote:

> A make from an external build dir fails on install, suggested fix:
> 
> install: all
>         $(MKDIR_P) '$(DESTDIR)$(includedir)'/pglogical_output
> -       $(INSTALL_DATA) pglogical_output/hooks.h '$(DESTDIR)$(includedir)'/pglogical_output
> +       $(INSTALL_DATA) $(top_srcdir)/contrib/pglogical_output/pglogical_output/hooks.h '$(DESTDIR)$(includedir)'/pglogical_output

Actually you should be able to use $(srcdir)/hooks.h ...

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

