On 13/08/16 17:34, Steve Singer wrote:
On 08/05/2016 11:00 AM, Petr Jelinek wrote:

as promised here is WIP version of logical replication patch.

Thanks for keeping on this.  This is important work

Feedback is welcome.

+<sect1 id="logical-replication-publication">
+  <title>Publication</title>
+  <para>
+    A Publication object can be defined on any master node, owned by one
+    user. A Publication is a set of changes generated from a group of
+    tables, and might also be described as a Change Set or Replication
+    Each Publication exists in only one database.

'A publication object can be defined on *any master node*'.  I found
this confusing the first time I read it because I thought it was
circular (what makes a node a 'master' node? Having a publication object
published from it?).   On reflection I realized that you mean ' any
*physical replication master*'.  I think this might be better worded as
'A publication object can be defined on any node other than a standby
node'.  I think referring to 'master' in the context of logical
replication might confuse people.

Makes sense to me.

I am raising this in the context of the larger terminology that we want
to use and potential confusion with the terminology we use for physical
replication. I like the publication / subscription terminology you've
gone with.

+    Publications are different from table schema and do not affect
+    how the table is accessed. Each table can be added to multiple
+    Publications if needed.  Publications may include both tables
+    and materialized views. Objects must be added explicitly, except
+    when a Publication is created for "ALL TABLES". There is no
+    default name for a Publication which specifies all tables.
+  </para>
+  <para>
+    The Publication is different from table schema, it does not affect
+    how the table is accessed and each table can be added to multiple

Those 2 paragraphs seem to start the same way.  I get the feeling that
there is some point your trying to express that I'm not catching onto.
Of course a publication is different than a tables schema, or different
than a function.

Ah that's relic of some editorialization, will fix. The reason why we think it's important to mention the difference between publication and schema is that they are the only objects that contain tables but they affect them in very different ways which might confuse users.

The definition of publication you have on the CREATE PUBLICATION page
seems better and should be repeated here (A publication is essentially a
group of tables intended for managing logical replication. See Section
30.1 <cid:part1.06040100.08080900@ssinger.info> for details about how
publications fit into logical replication setup. )

+  <para>
+    Conflicts happen when the replicated changes is breaking any
+    specified constraints (with the exception of foreign keys which are
+    not checked). Currently conflicts are not resolved automatically and
+    cause replication to be stopped with an error until the conflict is
+    manually resolved.

What options are there for manually resolving conflicts?  Is the only
option to change the data on the subscriber to avoid the conflict?
I assume there isn't a way to flag a particular row coming from the
publisher and say ignore it.  I don't think this is something we need to
support for the first version.

Yes you have to update data on subscriber or skip the the replication of whole transaction (for which the UI is not very friendly currently as you either have to consume the transaction pg_logical_slot_get_binary_changes or by moving origin on subscriber using pg_replication_origin_advance).

It's relatively easy to add some automatic conflict resolution as well, but it didn't seem absolutely necessary so I didn't do it for the initial version.

<sect1 id="logical-replication-architecture">
+  <title>Architecture</title>
+  <para>
+    Logical replication starts by copying a snapshot of the data on
+    the Provider database. Once that is done, the changes on Provider

I notice the user of 'Provider' above do you intend to update that to
'Publisher' or does provider mean something different. If we like the
'publication' terminology then I think 'publishers' should publish them
not providers.

Okay, I am just used to 'provider' in general (I guess londiste habit), but 'publisher' is fine as well.

I'm trying to test a basic subscription and I do the following

I did the following:

cluster 1:
create database test1;
create table a(id serial8 primary key,b text);
create publication testpub1;
 alter publication testpub1 add table a;
insert into a(b) values ('1');

create database test1;
create table a(id serial8 primary key,b text);
create subscription testsub2 publication testpub1 connection
'host=localhost port=5440 dbname=test1';
NOTICE:  created replication slot "testsub2" on provider
NOTICE:  synchronized table states


The initial sync completed okay, then I did

insert into a(b) values ('2');

but the second insert never replicated.

I had the following output

LOG:  terminating walsender process due to replication timeout

On cluster 1 I do

select * FROM pg_stat_replication;
 pid | usesysid | usename | application_name | client_addr |
client_hostname | client_port | backend_start |
backend_xmin | state | sent_location | write_location | flush_location |
replay_location | sync_priority | sy


(0 rows)

If I then kill  the cluster2 postmaster, I have to do a -9 or it won't die

That might explain why it didn't replicate. The wait loops in apply worker clearly need some work. Thanks for the report.

   Petr Jelinek                  http://www.2ndQuadrant.com/
   PostgreSQL Development, 24x7 Support, Training & Services

Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:

Reply via email to