Re: [HACKERS] Lets (not) break all the things. Was: [pgsql-advocacy] 9.6 - > 10.0

2016-04-30 Thread Tomasz Rybak
I cut many of the emails from CC - AFAIR most of you are subscribed to
pgsql-hackers.

On 2016-04-30 at 19:29, Joshua D. Drake wrote:

>On 04/29/2016 11:50 AM, Joshua D. Drake wrote:
>> On 04/29/2016 11:36 AM, Simon Riggs wrote:
>>
>>> Egos.
>>>
>>> Consider PgLogical, who is working on this outside of 2Q?
>>>
>>>
>>> Thank you for volunteering to assist. What would you like to work on?
>>
>> You are very welcome. I have been testing as you know. I would be happy
>> to continue that and also was going to look into having a subscriber
>> validate if it is connecting to a subscribed node or not which is the
>> error I ran into.
>>
>> I am also interested in creating user docs (versus reference docs).
>
>So what do you think Simon? How about we put pgLogical under community 
>infrastructure, use the postgresql redmine to track bugs, feature 
>requests and documentation?
>
>I guarantee resources from CMD and I bet we could get others to 
>participate as well. Let's turn this into an awesome community driven 
>extension.
>

I reviewed the pglogical-output extension in CF 2016-01. It was the only
patch I reviewed - it was quite big, and as I was doing it in the afternoons
and not during work (other responsibilities), it took me more than half
a month. And while the initial response was good, after the final review there
was no response and no new patch version - also nothing in CF 2016-03.

At the same time I've seen that pglogical got some love in March - but
I'm not sure whether it is usable without the *-output plugin. On the one
hand, splitting the two makes review easier, or at least manageable.
OTOH, one does not make much sense without the other. I can see that,
at least in theory, pglogical-output could be used independently, but
at the same time its main user will be pglogical proper.

So, in summary - slightly better management and communication regarding
features and patches (not only this one; this is just the patch I tried
to review) would be beneficial.
For now I'm not sure what is going on with pglogical,
or whether my review even mattered.

Best regards
Tomasz Rybak


Re: [HACKERS] Interesting read on SCM upending software and hardware architecture

2016-01-27 Thread Tomasz Rybak
On Mon, 2016-01-18 at 18:55 -0600, Jim Nasby wrote:
[ cut ]
> 
> My original article doesn't talk about SSDs; it's talking about 
> non-volatile memory architectures (quoted extract below). Fusion IO
> is 
> an example of this, and if NVDIMMs become available we'll see even 
> faster non-volatile performance.
> 
> To me, the most interesting point the article makes is that systems
> now 
> need much better support for multiple classes of NV storage. I agree 
> with your point that spinning rust is here to stay for a long time, 
> simply because it's cheap as heck. So systems need to become much
> better 
> at moving data between different layers of NV storage so that you're 
> getting the biggest bang for the buck. That will remain critical as
> long 
> as SCM's remain 25x more expensive than rust.
> 

I guess PostgreSQL is getting ready for such a world.
Parallel sequential scan, while not useful for spinning drives,
should shine with the hardware described in that article.

Add some tuning of effective_io_concurrency and we might
see gains even without a new storage layer.
Of course the ability to change the storage subsystem might
help with experimentation, but even now (OK, when 9.6 is out)
we might benefit from increased IO concurrency.

-- 
Tomasz Rybak  GPG/PGP key ID: 2AD5 9860
Fingerprint A481 824E 7DD3 9C0E C40A  488E C654 FB33 2AD5 9860
http://member.acm.org/~tomaszrybak






Re: [HACKERS] Re: pglogical_output - a general purpose logical decoding output plugin

2016-01-24 Thread Tomasz Rybak
 ...



> > Is it true (no way to access change data)? You added passing change
> > to C hooks; from looking at code it looks like it's true, but I
> > want to be sure.
> While the change data is now passed to the C hook, there's no attempt
> to expose it via PL/PgSQL. So yeah, that's still true.
> 

Thanks for confirming.

> > Speaking about flags - in most cases they are 0; only for attributes
> > we might have:
> >  +   flags |= IS_REPLICA_IDENTITY;
> > I assume that flags field is put into protocol for future needs?
> Correct. The protocol specifies (in most places; need to double check
> all sites) that the lower 4 bits are reserved and must be treated as
> an ERROR if set. The high 4 bits must be ignored if set and not
> recognised. That gives us some room to wiggle without bumping the
> protocol version incompatibly, and lets us use capabilities (via
> client-supplied parameters) to add extra information on the wire.
> 

Thanks for the explanation. You're right - protocol.txt explains that
quite nicely.
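
For my own understanding, a downstream-side check of that rule might look
roughly like this (constant and function names are mine, not from the patch):

    #define PGLO_FLAGS_RESERVED_MASK    0x0F    /* low 4 bits: error if set */
    #define PGLO_FLAGS_EXTENSION_MASK   0xF0    /* high 4 bits: ignore if unknown */

    static void
    check_message_flags(uint8 flags)
    {
        if (flags & PGLO_FLAGS_RESERVED_MASK)
            elog(ERROR, "reserved flag bits set in message: 0x%02x", flags);
        /* unrecognised high bits are silently ignored, per the capability rules */
    }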


> > +   /* FIXME support whole tuple (O tuple type) */
> > +   if (oldtuple != NULL)
> > +   {
> > +   pq_sendbyte(out, 'K');  /* old key follows */
> > +   pglogical_write_tuple(out, data, rel, oldtuple);
> > +   }
> > I don't fully understand. We are sending whole old tuple here,
> > so this FIXME should be more about supporting sending just keys.
> > But then comment "old key follows" is not true. Or am I missing
> > something here?
> In wal_level=logical the tuple that's written is an abbreviated tuple
> containing data only for the REPLICA IDENTITY fields.
> 

I didn't know that - thanks for the explanation.


 
> > +   pq_sendbyte(out, 'S');  /* message type field */
> > +   pq_sendbyte(out, 1);/* startup message version */
> > For now protocol is 1, but for code readability it might be better
> > to change this line to:
> > +   pq_sendbyte(out, PGLOGICAL_PROTO_VERSION_NUM);  /* startup
> > message version */
> The startup message format isn't the same as the protocol version.
> Hopefully we'll never have to change it. The reason it's specified is
> so that if we ever do bump it a decoding plugin can recognise an old
> client and fall back. Maybe it's BC overkill but I'd kick myself for
> not doing it if we ever decided to (say) add support for structured
> json startup options from the client. Working on BDR has taught me
> that there's no such thing as too much consideration for cross-
> version compat and negotiation in replication.

Then I'd suggest adding a named constant for this - so the startup message
version is not mistaken for the protocol version. Something like:

+ #define PGLOGICAL_STARTUP_MESSAGE_VERSION_NUM 1

This way it is obvious that the two "1"s do not mean the same thing.
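
For illustration, the write quoted above would then become (just a sketch):

    pq_sendbyte(out, 'S');  /* message type field */
    pq_sendbyte(out, PGLOGICAL_STARTUP_MESSAGE_VERSION_NUM);   /* startup message version */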

> > Just for the sake of avoiding code repetition:
> > +   for (i = 0; i < desc->natts; i++)
> > +   {
> > +   if (desc->attrs[i]->attisdropped)
> > +   continue;
> > +   nliveatts++;
> > +   }
> > +   pq_sendint(out, nliveatts, 2); 
> >  
> >  The exact same code is in write_tuple and write_attrs. I don't
> > know what's
> > policy for refactoring, but this might be extracted into separate
> > function.
> Seems trivial enough not to care, but probably can.

IMO such code (computing the number of live attributes in a relation) should
be used often enough to deserve its own function - isn't such a function
already available in PostgreSQL?
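
A minimal sketch of the helper I have in mind (the name is mine):

    /* Count the attributes of a relation that have not been dropped. */
    static int
    count_live_attributes(TupleDesc desc)
    {
        int     i;
        int     nliveatts = 0;

        for (i = 0; i < desc->natts; i++)
        {
            if (desc->attrs[i]->attisdropped)
                continue;
            nliveatts++;
        }
        return nliveatts;
    }

Both write_tuple and write_attrs could then simply do
pq_sendint(out, count_live_attributes(desc), 2);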



> > I really don't like this. We have function parameter "data" and
> > now
> > are creating new variable with the same name.
> I agree. Good catch.
> 
> I don't much like the use of 'data' as the param name for the plugin
> private data and am quite inclined to change that instead, to
> plugin_private or something.


Maybe rename the function parameter to plugin_data or plugin_private_data?


> > +   /* Find cached function info, creating if not found */
> > +   hentry = (struct PGLRelMetaCacheEntry*) hash_search(RelMetaCache,
> > +                                                        (void *)((rel)),
> > +                                                        HASH_ENTER, &found);
> > +
> > +   if (!found)
> > +   {
> > +       Assert(hentry->relid = RelationGetRelid(rel));
> > +       hentry->is_cached = false;
> > +       hentry->api_private = NULL;
> > +   }
> > +
> > +   Assert(hentry != NULL);
> >
> > Shouldn't Assert be just after calling hash_search? We're (if !found)
> > dereferencing hentry and only after checking whether it's not NULL.
> Yeah. It's more of an exit condition, stating "this function never
> returns non-null hentry" but moving it up is fine. Can do.

Please do so. It's more about conveying intent and increasing code
readability - so anyone can see that hentry is guaranteed not to be NULL
as returned by hash_search, rather than only some time later.
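
A sketch of the ordering I have in mind (the key argument is abbreviated,
and I've written the relid check with == - the quoted line uses a single =,
which looks like a typo):

    hentry = (struct PGLRelMetaCacheEntry *) hash_search(RelMetaCache,
                                                          key,  /* same key as in the patch */
                                                          HASH_ENTER, &found);
    /* hash_search with HASH_ENTER never returns NULL */
    Assert(hentry != NULL);

    if (!found)
    {
        Assert(hentry->relid == RelationGetRelid(rel));
        hentry->is_cached = false;
        hentry->api_private = NULL;
    }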


> Thanks again for this. Sorry it's taken so long to get the SGML docs
> converted. Too many other things on. 

No problem, I know how it is.

Best regards.

-- 
Tomasz Rybak  GPG/PGP key ID: 2AD5 9860
Fingerprint A481 824E 7DD3 9C0E C40A  488E C654 FB33 2AD5 9860
http://member.acm.org/~tomaszrybak






[HACKERS] Re: pglogical_output - a general purpose logical decoding output plugin

2016-01-24 Thread Tomasz Rybak
The following review has been posted through the commitfest application:
make installcheck-world:  not tested
Implements feature:   not tested
Spec compliant:   not tested
Documentation:            not tested

Final part of review:
protocol.txt

+|origin_identifier|signed char[origin_identifier_length]|An origin identifier 
of arbitrary, upstream-application-defined structure. _Should_ be text in the 
same encoding as the upstream database. NULL-terminated. _Should_ be 7-bit 
ASCII.

Does it need NULL-termination when the previous field contains the length of
origin_identifier?
Similarly for the relation metadata message.



+ metadata message. All consecutive row messages must currently have the same
+ relidentifier. (_Later extensions to add metadata caching will relax these
+ requirements for clients that advertise caching support; see the documentation
+ on metadata messages for more detail_).

Shouldn't this be changed once the metadata cache is implemented?




+ |relidentifier|uint32|relidentifier that matches the table metadata message 
sent for this row.
+ (_Not present in BDR, which sends nspname and relname instead_)

and

+ |natts|uint16|Number of fields sent in this tuple part.
+ (_Present in BDR, but meaning significantly different here)_

Is the BDR mention relevant here? BDR was not mentioned anywhere else, and now
it appears ex machina.



Long quote - but required.


+  Tuple fields
+ 
+ |===
+ |Tuple type|signed char|Identifies the kind of tuple being sent.
+ 
+ |tupleformat|signed char|‘**T**’ (0x54)
+ |natts|uint16|Number of fields sent in this tuple part.
+ (_Present in BDR, but meaning significantly different here)_
+ |[tuple field values]|[composite]|
+ |===
+ 
+ = Tuple tupleformat compatibility
+ 
+ Unrecognised _tupleformat_ kinds are a protocol error for the downstream.
+ 
+  Tuple field value fields
+ 
+ These message parts describe individual fields within a tuple.
+ 
+ There are two kinds of tuple value fields, abbreviated and full. Which is 
being
+ read is determined based on the first field, _kind_.
+ 
+ Abbreviated tuple value fields are nothing but the message kind:
+ 
+ |===
+ |*Message*|*Type/Size*|*Notes*
+ 
+ |kind|signed char| * ‘**n**’ull (0x6e) field
+ |===
+ 
+ Full tuple value fields have a length and datum:
+ 
+ |===
+ |*Message*|*Type/Size*|*Notes*
+ 
+ |kind|signed char| * ‘**i**’nternal binary (0x62) field
+ |length|int4|Only defined for kind = i\|b\|t
+ |data|[length]|Data in a format defined by the table metadata and column 
_kind_.
+ |===
+ 
+ = Tuple field values kind compatibility
+ 
+ Unrecognised field _kind_ values are a protocol error for the downstream. The
+ downstream may not continue processing the protocol stream after this
+ point**.**
+ 
+ The upstream may not send ‘**i**’nternal or ‘**b**’inary format values to the
+ downstream without the downstream negotiating acceptance of such values. The
+ downstream will also generally negotiate to receive type information to use to
+ decode the values. See the section on startup parameters and the startup
+ message for details.

I do not fully get it.
For each tuple we are supposed to have a "Tuple type" (which is the kind?). Does it
mean that tuple T1 might be sent using the "i" kind and tuple T2 using the "b" kind?
At the same time we have kind "n" (null) - but that belongs at the field level
(one field might be null, not the entire tuple).

In other words - do we have "i", then "T", then the number of attributes,
or "T", then the number of attributes, then "i" or "b" or "n" for each of the
attributes?

Also - the description of "b" seems to be missing.






+ Before sending changed rows for a relation, a metadata message for the 
relation
+ must be sent so the downstream knows the namespace, table name, column names,
+ optional column types, etc. A relidentifier field, an arbitrary numeric value
+ unique for that relation on that upstream connection, maps the metadata to
+ following rows.
+ 
+ A client should not assume that relation metadata will be followed immediately
+ (or at all) by rows, since future changes may lead to metadata messages being
+ delivered at other times. Metadata messages may arrive during or between
+ transactions.
+ 
+ The upstream may not assume that the downstream retains more metadata than the
+ one most recent table metadata message. This applies across all tables, so a
+ client is permitted to discard metadata for table x when getting metadata for
+ table y. The upstream must send a new metadata message before sending rows for
+ a different table, even if that metadata was already sent in the same session
+ or even same transaction. _This requirement will later be weakened by the
+ addition of client metadata caching, which will be advertised to the upstream
+ with an output plugin parameter._

This will need reworking once metadata caching is supported.




+ |Message type|signed char|‘**S**’ (0x53) - startup
+ |Startup message version|uint8|Value is always “1”.

Value is "1" for the current plugin version. It 

[HACKERS] Re: pglogical_output - a general purpose logical decoding output plugin

2016-01-21 Thread Tomasz Rybak
The following review has been posted through the commitfest application:
make installcheck-world:  not tested
Implements feature:   not tested
Spec compliant:   not tested
Documentation:            not tested

Documentation - although I haven't yet gone through the protocol documentation:

README.md

+ data stream. The output stream is designed to be compact and fast to decode,
+ and the plugin supports upstream filtering of data so that only the required
+ information is sent.

plugin supports upstream filtering of data through hooks so that ...



+ subset of that database may be selected for replication, currently based on
+ table and on replication origin. Filtering by a WHERE clause can be supported
+ easily in future.

Is this filtering by table and replication origin implemented? I haven't
noticed it in the source.



+ other daemon is required. It's accumulated using

The stream of changes is accumulated...



+ [the `CREATE_REPLICATION_SLOT ... LOGICAL ...` or `START_REPLICATION SLOT ... 
LOGICAL ...` 
commands](http://www.postgresql.org/docs/current/static/logicaldecoding-walsender.html)
 to start streaming changes. (It can also be used via
+ [SQL level 
functions](http://www.postgresql.org/docs/current/static/logicaldecoding-sql.html)
+ over a non-replication connection, but this is mainly for debugging purposes)

The replication slot can also be configured (causing the output plugin to be loaded)
via [SQL level functions]...




+ The overall flow of client/server interaction is:

The overall flow of client/server interaction is as follows:



+ * Client issues `CREATE_REPLICATION_SLOT slotname LOGICAL 'pglogical'` if 
it's setting up for the first time

* Client issues `CREATE_REPLICATION_SLOT slotname LOGICAL 'pglogical'` to set up
replication if it's connecting for the first time



+ Details are in the replication protocol docs.

Add a link to the file with the protocol documentation.




+ If your application creates its own slots on first use and hasn't previously
+ connected to this database on this system you'll need to create a replication
+ slot. This keeps track of the client's replay state even while it's 
disconnected.

If your application hasn't previously connected to this database on this system
it'll need to create and configure a replication slot, which keeps track of the
client's replay state even while it's disconnected.




+ `pglogical`'s output plugin now sends a continuous series of `CopyData`

As this is a separate plugin, use "'pglogical_output' plugin now sends..."
(not only here but also in a few other places).




+ All hooks are called in their own memory context, which lasts for the duration

All hooks are called in a separate hook memory context, which lasts for the
duration...





+ + switched to a longer lived memory context like TopMemoryContext. Memory 
allocated
+ + in the hook context will be automatically when the decoding session shuts 
down.

...will be automatically freed when the decoding...




DDL for global object changes must be synchronized via some external means.

Just:
Global object changes must be synchronized via some external means.





+ determine why an error occurs in a downstream, since you can examine a
+ json-ified representation of the xact. It's necessary to supply a minimal

since you can examine a transaction in json (and not binary) format. It's 
necessary





+ discard up to, as identifed by LSN (log sequence number). See

identified




+ Once you've peeked the stream and know the LSN you want to discard up to, you
+ can use `pg_logical_slot_peek_changes`, specifying an `upto_lsn`, to consume

Shouldn't it be pg_logical_slot_get_changes? get_changes consumes changes,
peek_changes leaves them in the stream. Especially as the example below
indicates that we need to use get_changes.



+ tp to but not including that point, i.e. that will be the
+ point at which replay resumes.

IMO it's better to introduce a new sentence:
that point. This will be the point at which replay resumes.



DESIGN.md:

+ attnos don't necessarily correspond. The column names might, and their 
ordering
+ might even be the same, but any column drop or column type change will result

The column names and their ordering might even be the same...









README.pglogical_output_plhooks:

+ Note that pglogical
+ must already be installed so that its headers can be found.

Note that pglogical_output must already...



+ Arguments are the oid of the affected relation, and the change type: 'I'nsert,
+ 'U'pdate or 'D'elete. There is no way to access the change data - columns 
changed,
+ new values, etc.

Is it true (no way to access change data)? You added passing the change data
to C hooks; from looking at the code it looks like it's true, but I want to be sure.



Re: [HACKERS] Re: pglogical_output - a general purpose logical decoding output plugin

2016-01-20 Thread Tomasz Rybak
On Wed, 2016-01-20 at 13:54 +0800, Craig Ringer wrote:
> On 20 January 2016 at 06:23, Tomasz Rybak <tomasz.ry...@post.pl>
> wrote:
> > The following review has been posted through the commitfest
> > application:
> > 
> Thanks!
>  
> >  
> > + /* Protocol capabilities */
> > + #define PGLOGICAL_PROTO_VERSION_NUM 1
> > + #define PGLOGICAL_PROTO_MIN_VERSION_NUM 1
> > Is this protocol version number and minimal recognized version
> > number,
> > or major and minor version number? I assume that
> > PGLOGICAL_PROTO_VERSION_NUM
> > is current protocol version (as in config max_proto_version and
> > min_proto_version). But why we have MIN_VERSION and not
> > MAX_VERSION?
> > 
> > From code in pglogical_output.c (lines 215-225 it looks like
> > PGLOGICAL_PROTO_VERSION_NUM is maximum, and
> > PGLOGICAL_PROTO_MIN_VERSION_NUM
> > is treated as minimal protocol version number.
> > 
> > I can see that those constants are exported in
> > pglogical_infofuncs.c and
> > pglogical.sql, but I do not understand omission of MAX.
> Thanks for stopping to think about this. It's one of the areas I
> really want to get right but I'm not totally confident of.
> 
> The idea here is that we want downwards compatibility as far as
> possible and maintainable but we can't really be upwards compatible
> for breaking protocol revisions. So the output plugin's native
> protocol version is inherently the max protocol version and we don't
> need a separate MAX.
> 
> The downstream connects and declares to the upstream "I speak
> protocol 2 through 3". The upstream sees this and replies "I speak
> protocol 1 through 2. We have protocol 2 in common so I will use
> that." Or alternately replies with an error "I only speak protocol 1
> so we have no protocol in common". This is done via the initial
> parameters passed by the downstream to the logical decoding plugin
> and then via the startup reply message that's the first message on
> the logical decoding stream.
> 
> We can't do a better handshake than this because the underlying
> walsender protocol and output plugin API only gives us one chance to
> send free-form information to the output plugin and it has to do so
> before the output plugin can send anything to the downstream.
> 
> As much as possible I want to avoid ever needing to do a protocol
> bump anyway, since it'll involve adding conditionals and duplication.
> That's why the protocol tries so hard to be extensible and rely on
> declared capabilities rather than protocol version bumps. But I'd
> rather plan for it than be unable to ever do it in future.
> 
> Much (all?) of this is discussed in the protocol docs. I should
> probably double check that and add a comment that refers to them
> there.
> 

Thanks for the explanation. I'll think about it more and try to propose
something for this.
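
As a first rough idea - purely a sketch, using the constants from the patch
and the usual Min/Max macros; the function name is mine:

    /*
     * Pick the protocol version to use, given the range the client declared
     * via min_proto_version/max_proto_version. The plugin's own
     * PGLOGICAL_PROTO_VERSION_NUM acts as the implicit maximum.
     */
    static int
    negotiate_proto_version(int client_min, int client_max)
    {
        int common_max = Min(client_max, PGLOGICAL_PROTO_VERSION_NUM);
        int common_min = Max(client_min, PGLOGICAL_PROTO_MIN_VERSION_NUM);

        if (common_min > common_max)
            elog(ERROR, "no protocol version in common with client "
                 "(client supports %d..%d, server supports %d..%d)",
                 client_min, client_max,
                 PGLOGICAL_PROTO_MIN_VERSION_NUM, PGLOGICAL_PROTO_VERSION_NUM);

        return common_max;  /* prefer the newest version both sides speak */
    }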

Best regards.

-- 
Tomasz Rybak  GPG/PGP key ID: 2AD5 9860
Fingerprint A481 824E 7DD3 9C0E C40A  488E C654 FB33 2AD5 9860
http://member.acm.org/~tomaszrybak






[HACKERS] Re: pglogical_output - a general purpose logical decoding output plugin

2016-01-20 Thread Tomasz Rybak
The following review has been posted through the commitfest application:
make installcheck-world:  not tested
Implements feature:   not tested
Spec compliant:   not tested
Documentation:            not tested

I reviewed more files:

pglogical_proto_native.c

+   pq_sendbyte(out, 'N');  /* column name block follows */
+   attname = NameStr(att->attname);
+   len = strlen(attname) + 1;
+   pq_sendint(out, len, 2);
+   pq_sendbytes(out, attname, len); /* data */
Identifier names are limited to 63 characters - so why are we sending 2 bytes here?
I do not have strong feelings about this - just curiosity. For:
+   pq_sendbyte(out, nspnamelen);   /* schema name length */
+   pq_sendbytes(out, nspname, nspnamelen);

+   pq_sendbyte(out, relnamelen);   /* table name length */
+   pq_sendbytes(out, relname, relnamelen);
the schema and relation name we send 1 byte, but for the attribute name 2 bytes.
Strange, a bit inconsistent.
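
If consistency is preferred, the attribute-name write could use a 1-byte length
too - a sketch of the quoted fragment, assuming names never exceed
NAMEDATALEN-1 (63) bytes:

    pq_sendbyte(out, 'N');              /* column name block follows */
    attname = NameStr(att->attname);
    len = strlen(attname) + 1;          /* include terminating NUL */
    pq_sendbyte(out, len);              /* column name length, 1 byte */
    pq_sendbytes(out, attname, len);    /* column name data */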

+   pq_sendbyte(out, 'B');  /* BEGIN */
+ 
+   /* send the flags field its self */
+   pq_sendbyte(out, flags);
Comment: "flags field its self"? Shouldn't be "... itself"?
Similarly in write_origin; write_insert just says:
+   /* send the flags field */
+   pq_sendbyte(out, flags);

Speaking about flags - in most cases they are 0; only for attributes
we might have:
+   flags |= IS_REPLICA_IDENTITY;
I assume the flags field is put into the protocol for future needs?


+   /* FIXME support whole tuple (O tuple type) */
+   if (oldtuple != NULL)
+   {
+   pq_sendbyte(out, 'K');  /* old key follows */
+   pglogical_write_tuple(out, data, rel, oldtuple);
+   }
I don't fully understand. We are sending the whole old tuple here,
so this FIXME should be more about supporting sending just the keys.
But then the comment "old key follows" is not true. Or am I missing
something here?
Similarly for write_delete.

+   pq_sendbyte(out, 'S');  /* message type field */
+   pq_sendbyte(out, 1);/* startup message version */
For now the protocol version is 1, but for code readability it might be better
to change this line to:
+   pq_sendbyte(out, PGLOGICAL_PROTO_VERSION_NUM);  /* startup message 
version */


Just for the sake of avoiding code repetition:
+   for (i = 0; i < desc->natts; i++)
+   {
+   if (desc->attrs[i]->attisdropped)
+   continue;
+   nliveatts++;
+   }
+   pq_sendint(out, nliveatts, 2);
The exact same code is in write_tuple and write_attrs. I don't know what the
policy for refactoring is, but this might be extracted into a separate function.

+   else if (att->attlen == -1)
+   {
+   char *data = DatumGetPointer(values[i]);
+ 
+   /* send indirect datums inline */
+   if (VARATT_IS_EXTERNAL_INDIRECT(values[i]))
+   {
+   struct varatt_indirect redirect;
+   VARATT_EXTERNAL_GET_POINTER(redirect, data);
+   data = (char *) redirect.pointer;
+   }
I really don't like this. We have a function parameter "data" and now
we are creating a new variable with the same name. It might lead to confusion
and some long debugging sessions. Please change the name of this variable.
Maybe attr_data would be OK? Or outputbytes, like below, for case 'b' and
default?
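
For example, just the rename applied to the quoted fragment:

    else if (att->attlen == -1)
    {
        char   *attr_data = DatumGetPointer(values[i]);

        /* send indirect datums inline */
        if (VARATT_IS_EXTERNAL_INDIRECT(values[i]))
        {
            struct varatt_indirect redirect;

            VARATT_EXTERNAL_GET_POINTER(redirect, attr_data);
            attr_data = (char *) redirect.pointer;
        }
        /* ... rest of the branch unchanged ... */
    }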


pglogical_proto_json.c

+   appendStringInfo(out, ", \"origin_lsn\":\"%X/%X\"",
+   (uint32)(txn->origin_lsn >> 32), (uint32)(txn->origin_lsn));
I remember there was a discussion on *-hackers recently about %X/%X; I'll
try to find it and check whether this matches the final conclusion.


pglogical_relmetacache.c

First: in pglogical_output.c, in pg_decode_startup we call
init_relmetacache. I haven't found a call to destroy_relmetacache,
and the comment says that it must be called at backend shutdown.
Is that guaranteed? Or will the cache get freed with its context?

+   /* Find cached function info, creating if not found */
+   hentry = (struct PGLRelMetaCacheEntry*) hash_search(RelMetaCache,
+                                                        (void *)((rel)),
+                                                        HASH_ENTER, &found);
+ 
+   if (!found)
+   {
+   Assert(hentry->relid = RelationGetRelid(rel));
+   hentry->is_cached = false;
+   hentry->api_private = NULL;
+   }
+ 
+   Assert(hentry != NULL);
Shouldn't the Assert be just after calling hash_search? We (if !found)
dereference hentry and only afterwards check whether it's not NULL.

I haven't 

[HACKERS] Re: pglogical_output - a general purpose logical decoding output plugin

2016-01-19 Thread Tomasz Rybak
The following review has been posted through the commitfest application:
make installcheck-world:  not tested
Implements feature:   not tested
Spec compliant:   not tested
Documentation:            not tested

First part of code review (for about 1/3rd of code):
pglogical_output.h:

+ /* Protocol capabilities */
+ #define PGLOGICAL_PROTO_VERSION_NUM 1
+ #define PGLOGICAL_PROTO_MIN_VERSION_NUM 1
Are these the protocol version number and the minimal recognized version number,
or the major and minor version number? I assume that PGLOGICAL_PROTO_VERSION_NUM
is the current protocol version (as in the config max_proto_version and
min_proto_version). But why do we have MIN_VERSION and not MAX_VERSION?

From the code in pglogical_output.c (lines 215-225) it looks like
PGLOGICAL_PROTO_VERSION_NUM is the maximum, and PGLOGICAL_PROTO_MIN_VERSION_NUM
is treated as the minimal protocol version number.

I can see that those constants are exported in pglogical_infofuncs.c and
pglogical.sql, but I do not understand the omission of MAX.


+ typedef struct HookFuncName
+ typedef struct PGLogicalTupleData
I haven't found these used anywhere, and they are not mentioned in the
documentation. Are those structures needed?


+ pglogical_config.c:
+   switch(get_param_key(elem->defname))
+   {
+   val = get_param_value(elem, false, OUTPUT_PARAM_TYPE_UINT32);
Why do we need this line here? All cases contain some variant of
val = get_param_value(elem, false, TYPE);
as the first line after "case PARAM_*:", so this line should be removed.

+   val = get_param(options, "startup_params_format", false, false,
+   OUTPUT_PARAM_TYPE_UINT32, &found);
+ 
+   params_format = DatumGetUInt32(val);
Shouldn't we check "found" here? We work with val (which is Datum(0)) - won't
it cause a SEGFAULT, or something similar?
Additionally - I haven't found any place where the code uses the "found" flag
passed back from get_param() - so as it's not used it might be removed.
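
Something along these lines is what I have in mind (the error wording is just
a sketch):

    val = get_param(options, "startup_params_format", false, false,
                    OUTPUT_PARAM_TYPE_UINT32, &found);

    if (!found)
        ereport(ERROR,
                (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
                 errmsg("missing required parameter: startup_params_format")));

    params_format = DatumGetUInt32(val);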


pglogical_output.c:

+   elog(DEBUG1, "Binary mode rejected: Server and client endian sizeof(Datum) mismatch");
+   return false;
+   }
+ 
+   if (data->client_binary_sizeofint != 0
+   && data->client_binary_sizeofint != sizeof(int))
+   {
+   elog(DEBUG1, "Binary mode rejected: Server and client endian sizeof(int) mismatch");
+   return false;
+   }
+ 
+   if (data->client_binary_sizeoflong != 0
+   && data->client_binary_sizeoflong != sizeof(long))
+   {
+   elog(DEBUG1, "Binary mode rejected: Server and client endian sizeof(long) mismatch");
Isn't "endian" here a case of copy-paste from the first error?
The error messages should rather look like:
elog(DEBUG1, "Binary mode rejected: Server and client sizeof(Datum) mismatch");

+ static void pg_decode_shutdown(LogicalDecodingContext * ctx)
In pg_decode_startup we create the main memory context and the hooks memory
context. In pg_decode_shutdown we delete the hooks memory context but not the
main one. Is this OK, or should we also add:
MemoryContextDelete(data->context);
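
Roughly like this - a sketch only, as I'm guessing the struct and field names
for the main and hooks contexts; they may not match the patch exactly:

    static void
    pg_decode_shutdown(LogicalDecodingContext *ctx)
    {
        PGLogicalOutputData *data = ctx->output_plugin_private;

        /* existing behaviour: free the hooks memory context */
        MemoryContextDelete(data->hooks_mctxt);

        /* suggested addition: also free the main decoding-session context */
        MemoryContextDelete(data->context);
    }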



[HACKERS] Re: pglogical_output - a general purpose logical decoding output plugin

2016-01-18 Thread Tomasz Rybak
I just quickly went through patch v5.
It's a rather big patch, at (or beyond) the limits of my PostgreSQL knowledge for
performing a high-quality review. But during this week I'll try to send reviews of
parts of the code, as going through it in one sitting seems impossible.

One proposed change - update the copyright to 2016.

I'd also propose changing, in pglogical_output_control.in:
comment = 'general purpose logical decoding plugin'
to something like "general-purpose plugin for decoding and generating a stream of
logical changes".

We might also think about changing the name of the plugin to something resembling
"logical_streaming_decoder" or even "logical_streamer".


Re: [HACKERS] pglogical_output - a general purpose logical decoding output plugin

2016-01-18 Thread Tomasz Rybak
 > all possible features.  A common core approach could probably be
> > made
> > acceptable with the argument that anyone will probably want to do
> > things
> > this way, so we might as well implement it once and give it to
> > people.
> That's what we're going for here. Extensible, something people can
> build on and use.
>  
> >  In a way, the logical decoding interface is the common core, as we
> > currently understand it.  But this submission clearly has a lot of
> > features beyond just the basics
> Really? What would you cut? What's beyond the basics here? What
> basics are you thinking of, i.e what set of requirements are you
> working towards / needs are you seeking to meet?
> 
> We cut this to the bone to produce a minimum viable logical
> replication solution. Especially the output plugin.
> 
> Cut the hook interfaces for row and xact filtering? You lose the
> ability to use replication origins, crippling functionality, and for
> no real gain in simplicity.
> 
> Remove JSON support? That's what most people are actually likely to
> want to use when using the output plugin directly, and it's important
> for debugging/tracing/diagnostics. It's a separate feature, to be
> sure, but it's also a pretty trivial addition.
>  
> >  and we could probably go through them
> > one by one and ask, why do we need this bit?  So that kind of
> > system
> > will be very hard to review as a standalone submission.
> > 
> Again, I disagree. I think you're looking at this way too narrowly.
> 
> I find it quite funny, actually. Here we go and produce something
> that's a nice re-usable component that other people can use in their
> products and solutions ... and all anyone does is complain that the
> other part required to use it as a canned product isn't posted yet
> (though it is now). But with BDR all anyone ever does is complain
> that it's too tightly coupled to the needs of a single product and
> the features extracted from it, like replication origins, should be
> more generic and general purpose so other people can use them in
> their products too. Which is it going to be?
> 
> It would be helpful if you could take a step back and describe what
> *you* think logical replication for PostgreSQL should look like. You
> clearly have a picture in mind of what it should be, what
> requirements it satisfies, etc. If you're going to argue based on
> that it'd be very helpful to describe it. I might've missed some
> important points you've seen and you might've overlooked issues I've
> seen. 
> 

This is rather long, but I do not want to cut too much, because
it shows a slight problem with the workflow in the PostgreSQL community.
I'm writing as someone trying to increase my involvement -
not a complete outsider, but not yet feeling that I fully belong.

I'll try to explain what I mean, taking this patch as an example.
It started as part of pglogical replication, but you split it out to ease
review. But this origin shows - in the name, in comments, in the README.
It's a good example of scratching an itch - but without a connection to
others, or maybe without the wider picture.
Don't get me wrong - having code to discuss is much better than just
bikeshedding about what we'd like to have. And the changes in v5
(like caching and passing tuples to hooks) give hope for further work.
OTOH, judging by parallel queries, maybe once it lands in the repository
it'll get more attention?

I can feel your frustration. Coming to a community without one's own
itch to scratch is also a bit frustrating - I do not know where
to start or what needs the most attention. I can see that
commitfests are in dire need of reviewers, so I started with
them. But at the same time I can only check whether the code looks
correct, applies cleanly, compiles, and whether the tests pass.

I do not see the bigger picture - and I also cannot see emails
discussing long- or mid-term direction or vision.
That makes it harder to feel that it matters and to decide
which patch to look at.
Both communities I feel attached to (Debian and PostgreSQL)
differ from many highly visible FLOSS projects in that they
have neither one backing company nor a benevolent dictator.
That gives them the freedom to pursue different goals without
the risk of disrupting a power structure, but at the same time
it makes it harder to connect the dots and see how the project
is doing.

OK, we went quite far away from the review. I do not have closing
remarks - only that I hope to provide a better review by the weekend.

And let's discuss the name - I do not fully like pglogical_decoding.

-- 
Tomasz Rybak  GPG/PGP key ID: 2AD5 9860
Fingerprint A481 824E 7DD3 9C0E C40A  488E C654 FB33 2AD5 9860
http://member.acm.org/~tomaszrybak






[HACKERS] Re: pglogical_output - a general purpose logical decoding output plugin

2016-01-03 Thread Tomasz Rybak
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   not tested
Spec compliant:   not tested
Documentation:            tested, failed

Applies cleanly on current master (b416c0bb622ce5d33fdbec3bbce00451132f10ec).

Builds without any problems on current Debian unstable (amd64 arch, GCC 5.3.1-4,
glibc 2.21-6).
There are 2 errors in the tests, but they also occur on a trunk build.
parallel group (20 tests):  json_encoding combocid portals_p2 advisory_lock 
tsdicts xmlmap equivclass guc functional_deps dependency json select_views 
cluster tsearch window jsonb indirect_toast bitmapops foreign_key foreign_data
 select_views ... FAILED
 portals_p2   ... ok
parallel group (2 tests):  create_view create_index
 create_index ... FAILED
 create_view  ... ok

README.md:
+Only one database is replicated, rather than the whole PostgreSQL install. A
[...]
+Unlike block-level ("physical") streaming replication, the change stream from
+the `pglogical` output plugin is compatible across different PostgreSQL
+versions and can even be consumed by non-PostgreSQL clients.
+
+Because logical decoding is used, only the changed rows are sent on the wire.
+There's no index change data, no vacuum activity, etc transmitted.
+
+The use of a replication slot means that the change stream is reliable and
+crash-safe. If

Isn't this a feature of logical replication, not of this particular plugin?
I'm not sure whether all that text needs to be repeated here.
OTOH this is a good summary - so maybe just add links to the base
documentation about replication, logical replication, slots, etc.

+# Usage
+
+The overall flow of client/server interaction is:
This part (the flow) belongs in DESIGN.md, not in usage.

+* [Client issues `IDENTIFY_SYSTEM`
Is the [ needed here?

protocol.txt
Contains wrapped lines, but also very long lines. While long
lines make sense for tables, they should not occur in paragraphs, e.g. in
+== Arguments client supplies to output plugin
and the following ones. It looks like parts of this file were formatted and
parts were not.

In summary, the documentation is mostly OK, and it describes the plugin quite nicely.
The only thing I haven't fully understood is the COPY-related part - so it might be
good to extend it a bit. And how does COPY relate to the JSON output format?

pglogical_output_plhooks.c
+#if PG_VERSION_NUM >= 90500
+   username = GetUserNameFromId(GetUserId(), false);
+#else
+   username = GetUserNameFromId(GetUserId());
+#endif

Is it needed? I mean - will this module be compiled for 9.4 (or earlier)
versions, or will it be 9.5 (or even 9.6)+ only?

pglogical_output.c
+   /*
+* Will we forward changesets? We have to if we're on 9.4;
+* otherwise honour the client's request.
+*/
+   if (PG_VERSION_NUM/100 == 904)
+   {
+   /*
+* 9.4 unconditionally forwards changesets due to lack of
+* replication origins, and it can't ever send origin info
+* for the same reason.
+*/

A similar case. In the mail from 2015-11-12 (patch v2) you removed 9.4 compatibility,
so I'm not sure whether checking for 9.4 or 9.5 makes any sense now.

This review focuses mostly on the documentation, but I went through both
the documentation and the code. I understood most of the code (and it makes
sense after some consideration :-) ), but I'm not proficient enough in PostgreSQL
to be fully sure that there are no hidden bugs.
At the same time, I haven't seen problems or suspicious fragments of code,
so after fixing the mentioned issues it should go to the committer.

Best regards.


The new status of this patch is: Waiting on Author




Re: [HACKERS] pglogical_output - a general purpose logical decoding output plugin

2015-11-18 Thread Tomasz Rybak
On Thu, 2015-11-12 at 22:23 +0800, Craig Ringer wrote:
> Hi all
> 
> Here's an updated pglogical_output patch.
> 
> Selected changes since v1:
> 
>     - add json protocol output support
>     - fix encoding comparisons to use parsed encoding not string name
>     - import protocol documentation
>     - significantly expand pg_regress tests
>     - move pglogical_output_plhooks to a top-level contrib
>     - remove 9.4 compatibility

I've just skimmed through the patch.
As you removed 9.4 compatibility - are the mentions of 9.4 and 9.5
compatibility still needed in README.md and README.plhooks?
It's not much text, but I'm not sure whether they shouldn't be
removed for a 9.6-targeting change.

-- 
Tomasz Rybak  GPG/PGP key ID: 2AD5 9860
Fingerprint A481 824E 7DD3 9C0E C40A  488E C654 FB33 2AD5 9860
http://member.acm.org/~tomaszrybak



