Re: [HACKERS] Applying logical replication changes by more than one process

2016-03-22 Thread konstantin knizhnik

On Mar 22, 2016, at 11:14 AM, Petr Jelinek wrote:
> 
> And each slot means connection with logical decoding attached to it so you 
> don't really want to have thousands of those anyway. I think you'll hit other 
> problems faster than loop over slots becomes problem if you plan to keep all 
> of them active.


Assume that cluster have thousands of nodes and we use sharding to scatter data 
through cluster nodes.
But to provide HA we want to perform sharding with some level of redundancy, 
for example save the same record at 3 different nodes.
Once possible approach (pg_shard) is to execute the same query at three 
different shards.
But there is no warranty that  result of execution will be the same at all 
nodes.
Alternative approach is to execute transaction at one node and then replicate 
it using logical replication to replicas.
So we do not perform logical replication to all 1000 nodes. Just to 2 of them. 
But each time it will be different pair of nodes. So we still need to have 1000 
active replication slots.

May be logical replication can not be used at all in such scenario - I have not 
thought much about it yet. Our first step will be multimaster without sharding.



> 
> -- 
>  Petr Jelinek  http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services
> 
> 
> -- 
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers



Re: [HACKERS] Applying logical replication changes by more than one process

2016-03-22 Thread konstantin knizhnik

On Mar 22, 2016, at 10:10 AM, Craig Ringer wrote:

> On 22 March 2016 at 14:32, konstantin knizhnik  
> wrote:
>  
>> Ah you mean because with wal_log=true the origin advance is in different WAL 
>> record than commit? OK yeah you might be one transaction behind then, true.
> 
> It actually means that we can not enforce database consistency. If we do 
> replorigin_advance  before commit and then crash happen, then we will loose 
> some changes.
> If we call replorigin_advance after commit but crash happen before, then some 
> changes can be applied multiple times. For example we can insert some record 
> twice (if there are no unique constraints).
> Look likes the only working scenario is to setup replication session for each 
> commit and use locking to prevent concurrent session setup for the same slot 
> by multiple process,  doesn't it?
> 
> Yes.
> 
> How would you expect it to work if you attempted to replorigin_advance 
> without a session? From multiple concurrent backends?

I would not work. But I wonder why I would need to call replorigin_advance 
without a session.
Please excuse me, I am not thinking now about the general case of using logical 
replication, I just focused on multimaster.
What I need is some efficient, durable and atomic mechanism for applying 
changes.
I see only two ways to provide atomicity:
1. Tracking of origins should be done inside xact as part of normal commit.
2. Use custom WAL records.

1) is easier now and it really works if I correctly synchronize access to 
slots. And surprisingly it even doesn't add substantial overhead.

> 
> Parallel apply is complicated business. You have to make sure you apply xacts 
> in an order that's free from deadlocks and from insert/delete anomalies - 
> though you can at least detect those, ERROR that xact and all subsequent 
> ones, and retry.

Well, this is exactly what our multimaster does. We do not try to enforce order 
of applying xacts. But we detect global deadlocks and use 2PC to provide data 
consistency.
So it is not task of logical replication, it is done by DTM overriding  
visibility checks and transaction commit protocol using XTM.


> For progress tracking to be consistent and correct you'd have to make sure 
> you committed strictly in the same order as upstream. Just before each commit 
> you can set the origin LSN and advance the replication origin, which will 
> commit atomically along with the commit it confirms. I don't really see the 
> problem.

Sorry, I do not completely understand you. What you mean by "will commit 
atomically along with the commit it confirms"? How this atomicity will be 
enforced?

>  
> I have tried it, fortunately it doesn't cause any noticeable performance 
> degradation. But unfortunately  can't consider such approach as elegant.
> Why it is actually necessary to bind replication slot to process? Why it is 
> not possible to have multiple concurrent sessions for the same slot?
> 
> Especially since most slot changes LWLock- and/or spinlock-protected already.
> 
> The client would have to manage replay confirmations appropriately so that it 
> doesn't confirm past the point where some other connection still needs it.
> 
> We'd have to expose a "slot" column in pg_stat_replication and remove the 
> "pid" column from pg_replication_slots to handle the 1:n relationship between 
> slot clients and slots, and it'd be a pain to show which normal user backends 
> were using a slot. Not really sure how to handle that.
> 
> To actually make this useful would require a lot more though. A way to 
> request that replay start from a new LSN without a full disconnect/reconnect 
> each time. Client-side parallel consume/apply. Inter-transaction ordering 
> information so the client can work out a viable xact apply order (possibly 
> using SSI information per the discussion with Kevin?). Etc.
> 
> I haven't really looked into this and I suspect there are some hairy areas 
> involved in replaying a slot from more than one client. The reason I'm 
> interested in it personally is for initial replica state setup as Oleksandr 
> prototyped and described earlier. We could attach to the slot's initial 
> snapshot then issue a new replication command that, given a table name or 
> oid, scans the table from the snapshot and passes each tuple to a new 
> callback (like, but not the same as, the insert callback) on the output 
> plugin.
> 
> That way clients could parallel-copy the initial state of the DB across the 
> same replication protocol they then consume new changes from, with no need to 
> make normal libpq connections and COPY initial state.
> 
> I'm interested in being able to do parallel receive of new changes from the 
> slot too, but suspect that'd be a bunch harder.
> 
>   
> Also I concern about using sequential search for slot location in 
> replorigin_session_setup and many other functions - there is loop through all 
>   max_replication_slots.
> It seems to be not a 

Re: [HACKERS] Applying logical replication changes by more than one process

2016-03-22 Thread Petr Jelinek

On 22/03/16 07:32, konstantin knizhnik wrote:


On Mar 21, 2016, at 4:30 PM, Petr Jelinek wrote:


On 21/03/16 14:25, Andres Freund wrote:

On 2016-03-21 14:18:27 +0100, Petr Jelinek wrote:

On 21/03/16 14:15, Andres Freund wrote:

Only when the origin is actually setup for the current session. You
need
to call the replorigin_advance yourself from your apply code.


That's problematic from a durability POV.



Huh? How come?


If you use the session mechanism the replication progress is synced with
the apply process, even if there are crashes. Crash recovery updates the
progress.  There's no such interlock with apply otherwise, and I don't
see how you can build one with reasonable effort.



Ah you mean because with wal_log=true the origin advance is in
different WAL record than commit? OK yeah you might be one transaction
behind then, true.


It actually means that we can not enforce database consistency. If we do
replorigin_advance  before commit and then crash happen, then we will
loose some changes.
If we call replorigin_advance after commit but crash happen before, then
some changes can be applied multiple times. For example we can insert
some record twice (if there are no unique constraints).
Look likes the only working scenario is to setup replication session for
each commit and use locking to prevent concurrent session setup for the
same slot by multiple process,  doesn't it?


You can do that, or you can move the tracking to the receiving process 
and spill the data to the disk (hurts IO obviously), or save the 
progress to table (also hurts IO), or write patch which solves this (no 
idea how though).




Also I concern about using sequential search for slot location
in replorigin_session_setup and many other functions - there is loop
through all max_replication_slots.
It seems to be not a problem when number of slots is less than 10. For
multimaster this assumption is true - even Oracle RAC rarely has
two-digit number of nodes.
But if we want to perform sharding and use logical replication for
providing redundancy, then number of nodes and slots can be essentially
larger.
I didn't think much about such configuration - may be it possible to
propose more efficient mechanism for replication in this case.



And each slot means connection with logical decoding attached to it so 
you don't really want to have thousands of those anyway. I think you'll 
hit other problems faster than loop over slots becomes problem if you 
plan to keep all of them active.


--
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Applying logical replication changes by more than one process

2016-03-22 Thread Craig Ringer
On 22 March 2016 at 14:32, konstantin knizhnik 
wrote:


> Ah you mean because with wal_log=true the origin advance is in different
> WAL record than commit? OK yeah you might be one transaction behind then,
> true.
>
>
> It actually means that we can not enforce database consistency. If we do 
> replorigin_advance
>  before commit and then crash happen, then we will loose some changes.
> If we call replorigin_advance after commit but crash happen before, then
> some changes can be applied multiple times. For example we can insert some
> record twice (if there are no unique constraints).
> Look likes the only working scenario is to setup replication session for
> each commit and use locking to prevent concurrent session setup for the
> same slot by multiple process,  doesn't it?
>

Yes.

How would you expect it to work if you attempted to replorigin_advance
without a session? From multiple concurrent backends?

Parallel apply is complicated business. You have to make sure you apply
xacts in an order that's free from deadlocks and from insert/delete
anomalies - though you can at least detect those, ERROR that xact and all
subsequent ones, and retry. For progress tracking to be consistent and
correct you'd have to make sure you committed strictly in the same order as
upstream. Just before each commit you can set the origin LSN and advance
the replication origin, which will commit atomically along with the commit
it confirms. I don't really see the problem.


> I have tried it, fortunately it doesn't cause any noticeable performance
> degradation. But unfortunately  can't consider such approach as elegant.
> Why it is actually necessary to bind replication slot to process? Why it
> is not possible to have multiple concurrent sessions for the same slot?
>

Especially since most slot changes LWLock- and/or spinlock-protected
already.

The client would have to manage replay confirmations appropriately so that
it doesn't confirm past the point where some other connection still needs
it.

We'd have to expose a "slot" column in pg_stat_replication and remove the
"pid" column from pg_replication_slots to handle the 1:n relationship
between slot clients and slots, and it'd be a pain to show which normal
user backends were using a slot. Not really sure how to handle that.

To actually make this useful would require a lot more though. A way to
request that replay start from a new LSN without a full
disconnect/reconnect each time. Client-side parallel consume/apply.
Inter-transaction ordering information so the client can work out a viable
xact apply order (possibly using SSI information per the discussion with
Kevin?). Etc.

I haven't really looked into this and I suspect there are some hairy areas
involved in replaying a slot from more than one client. The reason I'm
interested in it personally is for initial replica state setup as Oleksandr
prototyped and described earlier. We could attach to the slot's initial
snapshot then issue a new replication command that, given a table name or
oid, scans the table from the snapshot and passes each tuple to a new
callback (like, but not the same as, the insert callback) on the output
plugin.

That way clients could parallel-copy the initial state of the DB across the
same replication protocol they then consume new changes from, with no need
to make normal libpq connections and COPY initial state.

I'm interested in being able to do parallel receive of new changes from the
slot too, but suspect that'd be a bunch harder.



> Also I concern about using sequential search for slot location
> in replorigin_session_setup and many other functions - there is loop
> through all  max_replication_slots.
> It seems to be not a problem when number of slots is less than 10. For
> multimaster this assumption is true - even Oracle RAC rarely has two-digit
> number of nodes.
> But if we want to perform sharding and use logical replication for
> providing redundancy, then number of nodes and slots can be essentially
> larger.
>

Sounds like premature optimisation. Deal with it if it comes up in profiles
in scale testing with 100 clients. I'll be surprised if it does.


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Applying logical replication changes by more than one process

2016-03-22 Thread konstantin knizhnik

On Mar 21, 2016, at 4:30 PM, Petr Jelinek wrote:

> On 21/03/16 14:25, Andres Freund wrote:
>> On 2016-03-21 14:18:27 +0100, Petr Jelinek wrote:
>>> On 21/03/16 14:15, Andres Freund wrote:
> Only when the origin is actually setup for the current session. You
> need
> to call the replorigin_advance yourself from your apply code.
 
 That's problematic from a durability POV.
 
>>> 
>>> Huh? How come?
>> 
>> If you use the session mechanism the replication progress is synced with
>> the apply process, even if there are crashes. Crash recovery updates the
>> progress.  There's no such interlock with apply otherwise, and I don't
>> see how you can build one with reasonable effort.
>> 
> 
> Ah you mean because with wal_log=true the origin advance is in different WAL 
> record than commit? OK yeah you might be one transaction behind then, true.

It actually means that we can not enforce database consistency. If we do 
replorigin_advance  before commit and then crash happen, then we will loose 
some changes.
If we call replorigin_advance after commit but crash happen before, then some 
changes can be applied multiple times. For example we can insert some record 
twice (if there are no unique constraints).
Look likes the only working scenario is to setup replication session for each 
commit and use locking to prevent concurrent session setup for the same slot by 
multiple process,  doesn't it?
I have tried it, fortunately it doesn't cause any noticeable performance 
degradation. But unfortunately  can't consider such approach as elegant.
Why it is actually necessary to bind replication slot to process? Why it is not 
possible to have multiple concurrent sessions for the same slot?

Also I concern about using sequential search for slot location in 
replorigin_session_setup and many other functions - there is loop through all   
  max_replication_slots.
It seems to be not a problem when number of slots is less than 10. For 
multimaster this assumption is true - even Oracle RAC rarely has two-digit 
number of nodes.
But if we want to perform sharding and use logical replication for providing 
redundancy, then number of nodes and slots can be essentially larger.
I didn't think much about such configuration - may be it possible to propose 
more efficient mechanism for replication in this case.






> 
> -- 
>  Petr Jelinek  http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Applying logical replication changes by more than one process

2016-03-21 Thread Petr Jelinek

On 21/03/16 14:25, Andres Freund wrote:

On 2016-03-21 14:18:27 +0100, Petr Jelinek wrote:

On 21/03/16 14:15, Andres Freund wrote:

Only when the origin is actually setup for the current session. You
need
to call the replorigin_advance yourself from your apply code.


That's problematic from a durability POV.



Huh? How come?


If you use the session mechanism the replication progress is synced with
the apply process, even if there are crashes. Crash recovery updates the
progress.  There's no such interlock with apply otherwise, and I don't
see how you can build one with reasonable effort.



Ah you mean because with wal_log=true the origin advance is in different 
WAL record than commit? OK yeah you might be one transaction behind 
then, true.


--
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Applying logical replication changes by more than one process

2016-03-21 Thread Andres Freund
On 2016-03-21 14:18:27 +0100, Petr Jelinek wrote:
> On 21/03/16 14:15, Andres Freund wrote:
> >>Only when the origin is actually setup for the current session. You
> >>need
> >>to call the replorigin_advance yourself from your apply code.
> >
> >That's problematic from a durability POV.
> >
> 
> Huh? How come?

If you use the session mechanism the replication progress is synced with
the apply process, even if there are crashes. Crash recovery updates the
progress.  There's no such interlock with apply otherwise, and I don't
see how you can build one with reasonable effort.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Applying logical replication changes by more than one process

2016-03-21 Thread Petr Jelinek



On 21/03/16 14:15, Andres Freund wrote:



On March 21, 2016 2:08:54 PM GMT+01:00, Petr Jelinek  
wrote:

On 21/03/16 13:44, Konstantin Knizhnik wrote:



On 21.03.2016 15:10, Petr Jelinek wrote:

Hi,

On 19/03/16 11:46, Konstantin Knizhnik wrote:

Hi,

I am trying to use logical replication mechanism in implementation

of

PostgreSQL multimaster and faced with one conceptual problem.
Originally logical replication was intended to support asynchronous
replication. In this case applying changes by single process should

not

be a bottleneck.
But if we are using distributed transaction manager to provide

global

consistency, then applying transaction by one worker  leads to very

bad

performance and what is worser: cause unintended serialization of
transactions, which is not taken in account by distributed deadlock
detection algorithm and so can cause
undetected deadlocks.

So I have implemented pool of background workers which can apply
transactions concurrently.
It works and shows acceptable performance. But now I am thinking

about

HA and tracking origin LSNs which are needed to correctly specify

slot

position in case of recovery. And there is a problem: as far as I
understand to correctly record origin LSN in WAL and advance slot
position it is necessary to setup session
using replorigin_session_setup. It is not so convenient in case of

using

pool of background workers, because we have to setup session for

each

commit.
But the main problem is that for each slot session can be

associated

only with one process:

  else if (curstate->acquired_by != 0)
  {
  ereport(ERROR,
  (errcode(ERRCODE_OBJECT_IN_USE),
   errmsg("replication identifier %d is already active

for

PID %d",
  curstate->roident, curstate->acquired_by)));
  }

Which once again means that there can be only one process applying
changes.



That's not true, all it means is that you can do
replorigin_session_setup for same origin only in one process but you
don't need to have it setup for session to update it, the
replorigin_advance() works just fine.


But RecordTransactionCommit is using replorigin_session_advance, not
replorigin_advance.


Only when the origin is actually setup for the current session. You
need
to call the replorigin_advance yourself from your apply code.


That's problematic from a durability POV.



Huh? How come?

--
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Applying logical replication changes by more than one process

2016-03-21 Thread Andres Freund


On March 21, 2016 2:08:54 PM GMT+01:00, Petr Jelinek  
wrote:
>On 21/03/16 13:44, Konstantin Knizhnik wrote:
>>
>>
>> On 21.03.2016 15:10, Petr Jelinek wrote:
>>> Hi,
>>>
>>> On 19/03/16 11:46, Konstantin Knizhnik wrote:
 Hi,

 I am trying to use logical replication mechanism in implementation
>of
 PostgreSQL multimaster and faced with one conceptual problem.
 Originally logical replication was intended to support asynchronous
 replication. In this case applying changes by single process should
>not
 be a bottleneck.
 But if we are using distributed transaction manager to provide
>global
 consistency, then applying transaction by one worker  leads to very
>bad
 performance and what is worser: cause unintended serialization of
 transactions, which is not taken in account by distributed deadlock
 detection algorithm and so can cause
 undetected deadlocks.

 So I have implemented pool of background workers which can apply
 transactions concurrently.
 It works and shows acceptable performance. But now I am thinking
>about
 HA and tracking origin LSNs which are needed to correctly specify
>slot
 position in case of recovery. And there is a problem: as far as I
 understand to correctly record origin LSN in WAL and advance slot
 position it is necessary to setup session
 using replorigin_session_setup. It is not so convenient in case of
>using
 pool of background workers, because we have to setup session for
>each
 commit.
 But the main problem is that for each slot session can be
>associated
 only with one process:

  else if (curstate->acquired_by != 0)
  {
  ereport(ERROR,
  (errcode(ERRCODE_OBJECT_IN_USE),
   errmsg("replication identifier %d is already active
>for
 PID %d",
  curstate->roident, curstate->acquired_by)));
  }

 Which once again means that there can be only one process applying
 changes.

>>>
>>> That's not true, all it means is that you can do
>>> replorigin_session_setup for same origin only in one process but you
>>> don't need to have it setup for session to update it, the
>>> replorigin_advance() works just fine.
>>
>> But RecordTransactionCommit is using replorigin_session_advance, not
>> replorigin_advance.
>
>Only when the origin is actually setup for the current session. You
>need 
>to call the replorigin_advance yourself from your apply code.

That's problematic from a durability POV.
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Applying logical replication changes by more than one process

2016-03-21 Thread Petr Jelinek

On 21/03/16 13:44, Konstantin Knizhnik wrote:



On 21.03.2016 15:10, Petr Jelinek wrote:

Hi,

On 19/03/16 11:46, Konstantin Knizhnik wrote:

Hi,

I am trying to use logical replication mechanism in implementation of
PostgreSQL multimaster and faced with one conceptual problem.
Originally logical replication was intended to support asynchronous
replication. In this case applying changes by single process should not
be a bottleneck.
But if we are using distributed transaction manager to provide global
consistency, then applying transaction by one worker  leads to very bad
performance and what is worser: cause unintended serialization of
transactions, which is not taken in account by distributed deadlock
detection algorithm and so can cause
undetected deadlocks.

So I have implemented pool of background workers which can apply
transactions concurrently.
It works and shows acceptable performance. But now I am thinking about
HA and tracking origin LSNs which are needed to correctly specify slot
position in case of recovery. And there is a problem: as far as I
understand to correctly record origin LSN in WAL and advance slot
position it is necessary to setup session
using replorigin_session_setup. It is not so convenient in case of using
pool of background workers, because we have to setup session for each
commit.
But the main problem is that for each slot session can be associated
only with one process:

 else if (curstate->acquired_by != 0)
 {
 ereport(ERROR,
 (errcode(ERRCODE_OBJECT_IN_USE),
  errmsg("replication identifier %d is already active for
PID %d",
 curstate->roident, curstate->acquired_by)));
 }

Which once again means that there can be only one process applying
changes.



That's not true, all it means is that you can do
replorigin_session_setup for same origin only in one process but you
don't need to have it setup for session to update it, the
replorigin_advance() works just fine.


But RecordTransactionCommit is using replorigin_session_advance, not
replorigin_advance.


Only when the origin is actually setup for the current session. You need 
to call the replorigin_advance yourself from your apply code.


--
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Applying logical replication changes by more than one process

2016-03-21 Thread Konstantin Knizhnik



On 21.03.2016 15:10, Petr Jelinek wrote:

Hi,

On 19/03/16 11:46, Konstantin Knizhnik wrote:

Hi,

I am trying to use logical replication mechanism in implementation of
PostgreSQL multimaster and faced with one conceptual problem.
Originally logical replication was intended to support asynchronous
replication. In this case applying changes by single process should not
be a bottleneck.
But if we are using distributed transaction manager to provide global
consistency, then applying transaction by one worker  leads to very bad
performance and what is worser: cause unintended serialization of
transactions, which is not taken in account by distributed deadlock
detection algorithm and so can cause
undetected deadlocks.

So I have implemented pool of background workers which can apply
transactions concurrently.
It works and shows acceptable performance. But now I am thinking about
HA and tracking origin LSNs which are needed to correctly specify slot
position in case of recovery. And there is a problem: as far as I
understand to correctly record origin LSN in WAL and advance slot
position it is necessary to setup session
using replorigin_session_setup. It is not so convenient in case of using
pool of background workers, because we have to setup session for each
commit.
But the main problem is that for each slot session can be associated
only with one process:

 else if (curstate->acquired_by != 0)
 {
 ereport(ERROR,
 (errcode(ERRCODE_OBJECT_IN_USE),
  errmsg("replication identifier %d is already active for
PID %d",
 curstate->roident, curstate->acquired_by)));
 }

Which once again means that there can be only one process applying 
changes.




That's not true, all it means is that you can do 
replorigin_session_setup for same origin only in one process but you 
don't need to have it setup for session to update it, the 
replorigin_advance() works just fine.


But RecordTransactionCommit is using replorigin_session_advance, not 
replorigin_advance.

And replorigin_session_advance requires that session was setup:

void
replorigin_session_advance(XLogRecPtr remote_commit, XLogRecPtr 
local_commit)

{
Assert(session_replication_state != NULL);
}

"session_replication_state" is private variable which is set by 
replorigin_session_setup.
But attempt to call replorigin_session_setup from multiple process cause 
above error.


--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Applying logical replication changes by more than one process

2016-03-21 Thread Petr Jelinek

Hi,

On 19/03/16 11:46, Konstantin Knizhnik wrote:

Hi,

I am trying to use logical replication mechanism in implementation of
PostgreSQL multimaster and faced with one conceptual problem.
Originally logical replication was intended to support asynchronous
replication. In this case applying changes by single process should not
be a bottleneck.
But if we are using distributed transaction manager to provide global
consistency, then applying transaction by one worker  leads to very bad
performance and what is worser: cause unintended serialization of
transactions, which is not taken in account by distributed deadlock
detection algorithm and so can cause
undetected deadlocks.

So I have implemented pool of background workers which can apply
transactions concurrently.
It works and shows acceptable performance. But now I am thinking about
HA and tracking origin LSNs which are needed to correctly specify slot
position in case of recovery. And there is a problem: as far as I
understand to correctly record origin LSN in WAL and advance slot
position it is necessary to setup session
using replorigin_session_setup. It is not so convenient in case of using
pool of background workers, because we have to setup session for each
commit.
But the main problem is that for each slot session can be associated
only with one process:

 else if (curstate->acquired_by != 0)
 {
 ereport(ERROR,
 (errcode(ERRCODE_OBJECT_IN_USE),
  errmsg("replication identifier %d is already active for
PID %d",
 curstate->roident, curstate->acquired_by)));
 }

Which once again means that there can be only one process applying changes.



That's not true, all it means is that you can do 
replorigin_session_setup for same origin only in one process but you 
don't need to have it setup for session to update it, the 
replorigin_advance() works just fine.



--
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Applying logical replication changes by more than one process

2016-03-21 Thread Petr Jelinek

Hi,

On 19/03/16 11:46, Konstantin Knizhnik wrote:

Hi,

I am trying to use logical replication mechanism in implementation of
PostgreSQL multimaster and faced with one conceptual problem.
Originally logical replication was intended to support asynchronous
replication. In this case applying changes by single process should not
be a bottleneck.
But if we are using distributed transaction manager to provide global
consistency, then applying transaction by one worker  leads to very bad
performance and what is worser: cause unintended serialization of
transactions, which is not taken in account by distributed deadlock
detection algorithm and so can cause
undetected deadlocks.

So I have implemented pool of background workers which can apply
transactions concurrently.
It works and shows acceptable performance. But now I am thinking about
HA and tracking origin LSNs which are needed to correctly specify slot
position in case of recovery. And there is a problem: as far as I
understand to correctly record origin LSN in WAL and advance slot
position it is necessary to setup session
using replorigin_session_setup. It is not so convenient in case of using
pool of background workers, because we have to setup session for each
commit.
But the main problem is that for each slot session can be associated
only with one process:

 else if (curstate->acquired_by != 0)
 {
 ereport(ERROR,
 (errcode(ERRCODE_OBJECT_IN_USE),
  errmsg("replication identifier %d is already active for
PID %d",
 curstate->roident, curstate->acquired_by)));
 }

Which once again means that there can be only one process applying changes.



That's not true, all it means is that you can do 
replorigin_session_setup for same origin only in one process but you 
don't need to have it setup for session to update it, the 
replorigin_advance() works just fine.



--
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Applying logical replication changes by more than one process

2016-03-19 Thread Konstantin Knizhnik

Hi,

I am trying to use logical replication mechanism in implementation of 
PostgreSQL multimaster and faced with one conceptual problem.
Originally logical replication was intended to support asynchronous 
replication. In this case applying changes by single process should not be a 
bottleneck.
But if we are using distributed transaction manager to provide global consistency, then applying transaction by one worker  leads to very bad performance and what is worser: cause unintended serialization of transactions, which is not taken in account by 
distributed deadlock detection algorithm and so can cause

undetected deadlocks.

So I have implemented pool of background workers which can apply transactions 
concurrently.
It works and shows acceptable performance. But now I am thinking about HA and tracking origin LSNs which are needed to correctly specify slot position in case of recovery. And there is a problem: as far as I understand to correctly record origin LSN in WAL 
and advance slot position it is necessary to setup session

using replorigin_session_setup. It is not so convenient in case of using pool 
of background workers, because we have to setup session for each commit.
But the main problem is that for each slot session can be associated only with 
one process:

else if (curstate->acquired_by != 0)
{
ereport(ERROR,
(errcode(ERRCODE_OBJECT_IN_USE),
 errmsg("replication identifier %d is already active for PID %d",
curstate->roident, curstate->acquired_by)));
}

Which once again means that there can be only one process applying changes.

To provide correct state of replication node it is necessary to enforce that each logical replication record is replayed exactly once: we should not loose some change or try to apply it twice. So operation of recording original LSN position in WAL and 
adjusting slot should be atomic. And during recovery we should restore slot current position based on the origin values  extracted from WAL. I wonder if it can be done using current logical replication mechanism when changes of each slot are applied by 
more than one process? Or the only alternative is to write/read origin LSNs in WAL myself, for example using custom WAL records?


Thanks in advance!

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers