Re: [MLIST] Re: [mail] Re: [HACKERS] Big 7.4 items - Replication

2002-12-15 Thread David Walker
Another concern I have with multi-master systems is what happens if the
network splits in two, so that two master systems are taking commits for two
separate sets of clients.  It seems to me that re-syncing the two databases
once the network heals would be a very complex, or even impossible, task.

On Sunday 15 December 2002 04:16 am, Al Sutton wrote:
 Many thanks for the explanation. Could you explain where the order of the
 writesets is determined in the following scenario?

 If a transaction takes 50ms to reach one database from another, then for a
 specific data element (called X) the following timeline occurs:

 at 0ms, T1(X) is written to system A.
 at 10ms, T2(X) is written to system B.

 Where T1(X) and T2(X) conflict.

 My concern is that if the Group Communication Daemon (gcd) is operating on
 each database, a successful result for T1(X) will be returned to the client
 talking to database A because T2(X) has not reached it, and thus no
 conflict is known about, and a successful result is returned to the client
 submitting T2(X) to database B because it is not aware of T1(X). This would
 mean that the two clients believe both T1(X) and T2(X) completed
 successfully, yet they cannot both have done so because of the conflict.
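
The timeline above can be modelled in a few lines. This is only a sketch of the scenario as described, assuming each node acknowledges a write as soon as it sees no conflicting write locally; the names `LATENCY_MS`, `remote_arrival` and `locally_acknowledged` are made up for illustration and do not come from any replication system.

```python
# Hypothetical model: each node answers its client based only on what it has
# received so far, without waiting to hear from the other node.
LATENCY_MS = 50  # propagation delay between A and B, taken from the scenario

writes = [
    {"txn": "T1", "node": "A", "at_ms": 0},
    {"txn": "T2", "node": "B", "at_ms": 10},
]


def remote_arrival(write):
    """Time at which the write becomes visible on the other node."""
    return write["at_ms"] + LATENCY_MS


def locally_acknowledged(write, all_writes):
    """True if no conflicting remote write had arrived when this one was made."""
    for other in all_writes:
        if other is write:
            continue
        if remote_arrival(other) <= write["at_ms"]:
            return False  # the conflicting write was already visible locally
    return True


results = {w["txn"]: locally_acknowledged(w, writes) for w in writes}
print(results)
```

Under that assumption both transactions are acknowledged, because T1 only reaches B at 50ms and T2 only reaches A at 60ms, which is exactly the situation being asked about.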

 Thanks,

 Al.

 - Original Message -
 From: Darren Johnson [EMAIL PROTECTED]
 To: Al Sutton [EMAIL PROTECTED]
 Cc: Bruce Momjian [EMAIL PROTECTED]; Jan Wieck
 [EMAIL PROTECTED]; [EMAIL PROTECTED];
 PostgreSQL-development [EMAIL PROTECTED]
 Sent: Saturday, December 14, 2002 6:48 PM
 Subject: Re: [mail] Re: [HACKERS] Big 7.4 items - Replication

  b) The Group Communication blob will consist of a number of processes
  which need to talk to all of the others to interrogate them for changes
  which may conflict with the current write that is being handled, and then
  issue the transaction response. This is basically the two-phase commit
  solution with the phases moved into the group communication process.

  I can see the possibility of using solution b and having fewer group
  communication processes than databases as an attempt to simplify things,
  but this would mean the loss of a number of databases if the machine
  running the group communication process for the set of databases is lost.
 
  The group communication system doesn't just run on one system.  For
  Postgres-R using Spread there is actually a Spread daemon that runs on
  each database server.  It has nothing to do with detecting the conflicts.
  Its job is to deliver messages in a total order for writesets, or simple
  order for commits, aborts, joins, etc.

  The detection of conflicts will be done at the database level, by a
  backend process.  The basic concept is that if all databases get the
  writesets (changes) in the exact same order, apply them in a consistent
  order, and avoid conflicts, then one-copy serializability is achieved (one
  copy of the database replicated across all databases in the replica).
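
A toy illustration of that idea, not Postgres-R code: if every replica receives the same writesets in the same total order and applies the same deterministic conflict rule, every replica reaches the same decision. The "first writer of an element wins" rule and the function name `apply_in_total_order` are assumptions made for this sketch.

```python
def apply_in_total_order(writesets):
    """Apply writesets in delivery order; the first writer of an element wins.

    `writesets` is a list of (transaction, data_element) pairs in the order
    the group communication layer delivered them (assumed identical everywhere).
    """
    committed_by = {}  # data element -> transaction that wrote it first
    outcome = {}       # transaction -> "commit" or "abort"
    for txn, element in writesets:
        if element in committed_by:
            outcome[txn] = "abort"   # conflicts with an earlier writeset
        else:
            committed_by[element] = txn
            outcome[txn] = "commit"
    return outcome


# Both replicas are handed the same total order by the group communication layer.
total_order = [("T1", "X"), ("T2", "X")]
replica_a = apply_in_total_order(total_order)
replica_b = apply_in_total_order(total_order)
print(replica_a, replica_b)
```

Because the decision depends only on the delivered order, the two replicas cannot disagree about which of T1(X) and T2(X) survives.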
 
  I hope that explains the group communication system's responsibility.
 
  Darren
 
 
 
 
 
 
  ---(end of broadcast)---
  TIP 5: Have you checked our extensive FAQ?
 
  http://www.postgresql.org/users-lounge/docs/faq.html




Re: [MLIST] Re: [mail] Re: [HACKERS] Big 7.4 items - Replication

2002-12-15 Thread Al Sutton
David,

This can be resolved by requiring that, for any transaction to succeed, the
entry-point database must receive acknowledgements from n/2 + 0.5 (rounded up
to the nearest integer) databases, i.e. a strict majority, where n is the
total number in the replicant set. The following cases are shown as examples:

 Total number of databases | Number required to accept transaction
---------------------------+---------------------------------------
                         2 |                                     2
                         3 |                                     2
                         4 |                                     3
                         5 |                                     3
                         6 |                                     4
                         7 |                                     4
                         8 |                                     5

This would prevent two replicant sub-sets forming, because it is impossible
for both sets to have over 50% of the databases.
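
As a quick check of the arithmetic, here is a minimal sketch; the function names `required_acks` and `both_sides_can_commit` are made up for illustration. Rounding n/2 + 0.5 up to the nearest integer is the same as integer-dividing n by 2 and adding 1.

```python
def required_acks(n):
    """Smallest count that is strictly more than half of n databases."""
    return n // 2 + 1


# Reproduces the cases listed above for 2 to 8 databases.
table = {n: required_acks(n) for n in range(2, 9)}
print(table)


def both_sides_can_commit(n, side_a):
    """True if a split into side_a and n - side_a databases lets both commit."""
    side_b = n - side_a
    quorum = required_acks(n)
    return side_a >= quorum and side_b >= quorum


# Try every possible two-way split for every set size in the table.
split_brain_possible = any(
    both_sides_can_commit(n, a) for n in range(2, 9) for a in range(n + 1)
)
print(split_brain_possible)
```

No two-way split lets both halves reach the required count, since two disjoint groups would together need more databases than exist.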

Applications would be able to detect when a database has dropped out of the
replicant set, because the database could report a state of "Unable to obtain
majority consensus". This would allow applications to differentiate between a
database that is out of the set, where writing to other databases in the set
could yield a successful result, and "Unable to commit due to conflict", where
trying other databases is pointless.
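
A rough sketch of how an application might act on those two states. Everything here is hypothetical: the status strings are taken from the wording above, and `write_with_failover` and the `submit` callback are made-up names, not part of any existing client library.

```python
# Hypothetical status values a database could report back to the application.
OK = "committed"
NO_MAJORITY = "unable to obtain majority consensus"
CONFLICT = "unable to commit due to conflict"


def write_with_failover(databases, submit):
    """Try each database in turn; stop as soon as the outcome is final.

    `submit` is a caller-supplied function that sends the transaction to one
    database and returns one of the status values above.
    """
    for db in databases:
        status = submit(db)
        if status == OK:
            return db, OK
        if status == CONFLICT:
            return db, CONFLICT  # trying other databases is pointless
        # NO_MAJORITY: this database is outside the majority, try the next one
    return None, NO_MAJORITY


# Usage: database A is cut off from the majority, B can still commit.
responses = {"A": NO_MAJORITY, "B": OK, "C": OK}
result = write_with_failover(["A", "B", "C"], responses.get)
print(result)
```

The point of the design is that only the "no majority" state triggers a retry elsewhere; a conflict is reported to the caller immediately.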

Al
