Re: [HACKERS] [PATCHES] replication docs: split single vs.

Hannu Krosing Thu, 16 Nov 2006 23:31:06 -0800

On a fine day, Fri, 2006-11-17 at 00:01, Bruce Momjian wrote:
> Markus Schiltknecht wrote:
> > Not mentioning that categorization doesn't help in clearing the 
> > confusion. Just look around, most people use these terms. They're used 
> > by MySQL and Oracle. Even Microsoft's ActiveDirectory seems to have a 
> > multi-master operation mode.
> 
> OK.
> 
> > > For example, Slony is clearly single-master, 
> > 
> > Agreed.
> > 
> > > but
> > > what about data partitioning?  That is multi-master, in that there is
> > > more than one master, but only one master per data set.  
> > 
> > Data Partitioning is a way to work around the trouble of database 
> > replication in the application layer. Instead of trying to categorize it 
> > like a replication algorithm, we should explain that working around the 
> > trouble may be worthwhile in many cases.
> 
> OK.  I am still feeling that data partitioning is like master/slave
> replication because you have to get that read-only copy to the other
> server.  If you split things up so data sets resided on only one
> machine, you are right that would not be replication, but do people do
> that?  If so, it is almost another solution.


People do that in cases where there are high write loads ("high" as in
"not 10+ times less than reads") and just replicating the RO copies
would be prohibitively expensive in network, CPU or memory terms.

pl/proxy is one tool for doing it. You can get the latest stable version
from https://developer.skype.com/SkypeGarage/DbProjects . 
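
To make the "one master per data set" idea concrete, here is a minimal
sketch in Python. It is not pl/proxy code and does not use its API; the
partition names, the CRC32 hash and the in-memory dicts standing in for
databases are all assumptions made for illustration. The only point is
that each key is written to, and read from, exactly one partition, so
nothing has to be replicated.

```python
import zlib

# Hypothetical partition names; a real setup would map these to databases.
PARTITIONS = ["db0", "db1", "db2", "db3"]


def partition_for(key, partitions=PARTITIONS):
    """Return the single partition that owns `key` (stable CRC32 hash)."""
    h = zlib.crc32(key.encode("utf-8"))
    return partitions[h % len(partitions)]


class PartitionedStore:
    """In-memory stand-in for a set of databases, one dict per partition."""

    def __init__(self, partitions=PARTITIONS):
        self.partitions = list(partitions)
        self.data = {name: {} for name in self.partitions}

    def put(self, key, value):
        # The write goes to the owning partition only; no copy is made.
        self.data[partition_for(key, self.partitions)][key] = value

    def get(self, key):
        # The read is routed with the same hash, so it finds the same node.
        return self.data[partition_for(key, self.partitions)].get(key)


store = PartitionedStore()
store.put("alice", {"balance": 10})
store.put("bob", {"balance": 20})
```

Write load is spread over the partitions by the hash, and a read-only
copy is only needed if you want one for other reasons.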

> > > And for
> > > multi-master, Oracle RAC is clearly multi master,
> > 
> > Yes.
> > 
> > >  and I can see pgpool
> > > as multi-master, or as several single-master systems, in that they
> > > operate independently.  
> > 
> > Several single-master systems? C'mon! Pgpool simply implements the most 
> > simplistic form of multi-master replication. 

In what way is pgpool multi-master? Last time I looked, it did nothing
but apply DML to several databases, i.e. it is not replication at all,
or at least it is masterless, unless we think of the pgpool process
itself as the _single_ master :)
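
What I mean by "applying DML to several databases" can be sketched in a
few lines of Python. This is not pgpool's implementation; SQLite
in-memory databases stand in for the backend servers, and the
broadcast() helper and table name are made up for the example.

```python
import sqlite3


def broadcast(connections, sql, params=()):
    """Apply the same statement to every database, one after another."""
    for conn in connections:
        conn.execute(sql, params)
        conn.commit()


# Three independent databases; none of them knows about the others.
nodes = [sqlite3.connect(":memory:") for _ in range(3)]

broadcast(nodes, "CREATE TABLE t (id INTEGER PRIMARY KEY, v TEXT)")
broadcast(nodes, "INSERT INTO t (v) VALUES (?)", ("hello",))

# Every node received the same write directly from the broadcaster.
counts = [c.execute("SELECT count(*) FROM t").fetchone()[0] for c in nodes]
```

No node ships changes to another node here; the only thing that sees
every write is the broadcasting process itself.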

> > Just because you can access 
> > the single databases inside the cluster doesn't make it less 
> > Multi-Master, does it?
> 
> OK, changed to "Multi-Master Replication Using Query Broadcasting".

I think this gives a completely wrong picture of what pgpool does.

How about just "Query Broadcasting"?

> > 
> > > After much thought, it seems that putting things
> > > into single/multi-master categories just adds more confusion, because
> > > several solutions just aren't clear
> > 
> > Agreed, I'm not saying you must categorize all solutions you describe. 
> > But please do categorize the ones which can be (and have so often been) 
> > categorized.
> 
> OK.
> 
> > > or fall into neither, e.g. Shared Disk Failover.
> > 
> > Oh, yes, this reminds me of Brad Nicholson's suggestion in [1] to add a 
> > warning "about the risk of having two postmaster come up...".
> 
> 
> Added.
> 
> > 
> > What about other means of sharing disks or filesystems? NBDs or even 
> > worse: NFS?
> 
> Added.
> 
> > 
> > > Another issue is that you mentioned heavy locking for
> > > multi-master, when in fact pgpool doesn't do any special inter-server
> > > locking, so it just doesn't apply.
> > 
> > Sure it does apply, in the sense that *every* single lock is granted and 
> > released on *every* node. The total number of locks scales linearly with 
> > the number of nodes in the cluster.
> 
> Uh, but the locks are the same on each machine as if it was a single
> server, while in a cluster, the locks are more intertwined with other
> things that are happening on the server, no?
> 
> > > In summary, it just seemed clearer to talk about each item and how it
> > > works, rather than try to categorize them.  The categorization just
> > > seems to do more harm than good.
> > > 
> > > Of course, I might be totally wrong, and am still looking for feedback,
> > > but these are my current thoughts.  Feedback?
> > 
> > AFAICT, the categorization in Single- and Multi-Master replication is 
> > very common. I think that's partly because it's focused on the solution. 
> > One can ask: do I want to write on all nodes or is a failover solution 
> > sufficient? Or can I probably get away with a read-only Slave?
> 
> OK.
> 
> > It's a categorization the user does, often before having a glimpse about 
> > how complicated database replication really is. Thus, IMO, it would make 
> > sense to help the user and allow him to quickly find answers. (And we 
> > can still tell them that it's not easy or even possible to categorize 
> > all the solutions.)
> > 
> > > I didn't mention distributed shared memory as a separate item because I
> > > felt it was an implementation detail of clustering, rather than
> > > something separate.  I kept two-phase in the cluster item for the same
> > > reason.
> > 
> > Why is pgpool not an implementation detail of clustering, then?
> > 
> > > Current version at:
> > > 
> > >   http://momjian.us/main/writings/pgsql/sgml/failover.html
> > 
> > That somehow doesn't work for me:
> 
> I lost power for a few hours.  I am back online.  I have updated the
> docs at that URL.  Please check and let me know.
> 
-- 
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me:  callto:hkrosing
Get Skype for free:  http://www.skype.com


