Re: [DOCS] High Availability, Load Balancing, and Replication Feature Matrix

2007-11-10 Thread Markus Schiltknecht

Hello Bruce,

Bruce Momjian wrote:

> I have added a High Availability, Load Balancing, and Replication
> Feature Matrix table to the docs:


Nice work. I appreciate your efforts in clearing up the uncertainty that 
surrounds this topic.


As you might have guessed, I have some complaints regarding the Feature 
Matrix. I hope this won't discourage you, but I'd rather like to 
contribute to an improved variant.


First of all, I don't quite like the negated formulations. I can see 
that you want a dot to mark a positive feature, but I find it hard to 
understand.


What especially puzzles me is the "master never locks others" item. All 
first four, namely "shared disk failover", "file system replication", 
"warm standby" and "master slave replication", block others (the slaves) 
completely, which is about the worst kind of lock.


Comparing "File System Replication" and "Shared Disk Failover", 
you state that the former has "master server overhead", while the latter 
doesn't. Seen solely from the single server node, this might be true. 
But summed over the cluster, you have a network with a quite similar 
load in both cases. I wouldn't say one has less overhead than the other 
by definition.


Then, you are mixing apples and oranges. Why should a "statement based 
replication solution" not require conflict resolution? You can build 
eager as well as lazy statement based replication solutions; the two 
properties have nothing to do with each other, do they?


Same applies to "master slave replication" and "per table granularity".

And in the special case of (async, but eager) Postgres-R also to "async 
multi-master replication" and "no conflict resolution necessary". 
Although I can understand that that's a pretty nifty difference.


Given the matrix focuses on practically available solutions, I can see 
some value in it. But from a more theoretical viewpoint, I find it 
pretty confusing. Now, if you want a practically usable feature 
comparison table, I'd strongly vote for clearly mentioning the products 
you have in mind - otherwise the table pretends to be something it is not.


If it should be theoretically correct without mentioning available 
solutions, I'd rather vote for explaining the terms and concepts.


To clarify my viewpoint, I'll quickly go over the features you're 
mentioning and associate them with the concepts, as I understand them.


 - special hardware:  always nice, not much theoretical effect, a
  network is a network, storage is storage.

 - multiple masters:  that's what single- vs multi masters is about:
  writing transactions. Can be mixed with
  eager/lazy, every combination makes
  sense for certain applications.

 - overhead:  replication per definition generates overhead,
  question is: how much, and where.

 - locking of others: again, question of how much and how fine grained
  the locking is. In a single master repl. sol., the
  slaves are locked completely. In lazy repl. sol.,
  the locking is deferred until after the commit,
  during conflict resolution. In eager repl. sol.,
  the locking needs to take place before the commit.
  But all replication systems need some kind of
  locks!

 - data loss on fail: solely dependent on eager/lazy. (Given a real
  replication, with a replica, which shared storage
  does not provide, IMO)

 - slaves read only:  theoretically possible with all replication
  systems, be they lazy/eager, single-/multi-
  master. That we are unable to read from slave
  nodes is an implementation annoyance of
  Postgres, if you want.

 - per table gran.:   again, independent of lazy/eager, single-/multi.
  Depends solely on the level where data is
  replicated: block device, file system, statement,
  WAL or other internal format.

 - conflict resol.:   in multi master systems, that depends on the
  lazy/eager property. Single master systems
  obviously never need to resolve conflicts.
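
The mapping above can be sketched in a few lines of Python. This is only an illustration of the argument, not code from any replication product: the class name `ReplicationDesign` and its methods are made up for this example, and the rules are a direct reading of the list items on data loss, conflict resolution, and locking.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class ReplicationDesign:
    """Hypothetical model of the two independent axes discussed above."""
    eager: bool         # True: changes are propagated before commit; False: lazy
    multi_master: bool  # True: more than one node accepts writing transactions

    def can_lose_data_on_failure(self) -> bool:
        # "data loss on fail: solely dependent on eager/lazy"
        return not self.eager

    def needs_conflict_resolution(self) -> bool:
        # Single master never needs it; for multi-master it depends on eager/lazy.
        return self.multi_master and not self.eager

    def locking(self) -> str:
        # Every design locks somewhere; only the place and granularity differ.
        if not self.multi_master:
            return "slaves locked completely"
        if self.eager:
            return "before commit"
        return "after commit, during conflict resolution"


def all_designs():
    """Enumerate the four eager/lazy x single-/multi-master combinations."""
    return [ReplicationDesign(eager=e, multi_master=m)
            for e, m in product((True, False), repeat=2)]


if __name__ == "__main__":
    for d in all_designs():
        print(d, d.can_lose_data_on_failure(),
              d.needs_conflict_resolution(), d.locking())
```

Only the lazy multi-master combination needs conflict resolution in this model, which is why the "no conflict resolution necessary" column cannot be decided from "statement based" alone.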

IMO, "data partitioning" is entirely perpendicular to replication. It 
can be combined, in various ways. There's horizontal and vertical 
partitioning, eager/lazy and single-/multi-master replication. I guess 
we could find a use case for most of the combinations thereof. (Kudos 
for finding a combination which definitely has no use case).


Well, these are my theories, do with them whatever you like. Comments 
appreciated.


Kind regards

Markus


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [DOCS] High Availability, Load Balancing, and Replication Feature Matrix

2007-11-10 Thread Bruce Momjian
Markus Schiltknecht wrote:
> Hello Bruce,
> 
> Bruce Momjian wrote:
> > I have added a High Availability, Load Balancing, and Replication
> > Feature Matrix table to the docs:
> 
> Nice work. I appreciate your efforts in clearing up the uncertainty that 
> surrounds this topic.
> 
> As you might have guessed, I have some complaints regarding the Feature 
> Matrix. I hope this won't discourage you, but I'd rather like to 
> contribute to an improved variant.

Not sure if you were around when we wrote this chapter but there was a
lot of good discussion to get it to where it is now.

> First of all, I don't quite like the negated formulations. I can see 
> that you want a dot to mark a positive feature, but I find it hard to 
> understand.

Well, the idea is to say "what things do I want and what offers it?"  If
you have positive/negative it makes it harder to do that.  I realize it
is confusing in a different way.  We could split out the negatives into
a different table but that seems worse.
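
To illustrate the "what do I want and what offers it?" lookup: if every column is phrased so that a dot is good, the query is a simple "all wanted features present" test. The sketch below is hypothetical. The dictionary `KNOWN_CELLS` holds only cells that are stated somewhere in this thread, not the full table from the docs, and cells nobody mentioned are left out rather than guessed.

```python
# Hypothetical fragment of the matrix. True = dot, False = no dot.
# Only cells stated in this thread are included; this is not the docs table.
KNOWN_CELLS = {
    "Shared Disk Failover": {
        "No master server overhead": True,
        "Per-table granularity": False,
    },
    "File System Replication": {
        "No master server overhead": False,
    },
    "Statement-Based Replication": {
        "No conflict resolution necessary": False,
    },
}


def solutions_offering(wanted, cells=KNOWN_CELLS):
    """Return the solutions that are known to have a dot for every wanted feature."""
    return sorted(
        name for name, row in cells.items()
        # A missing cell is treated as "not known to be offered".
        if all(row.get(feature, False) for feature in wanted)
    )


if __name__ == "__main__":
    print(solutions_offering({"No master server overhead"}))
```

With mixed positive and negative columns, the same lookup would need a per-column flag saying whether a dot is wanted or unwanted, which is the extra difficulty meant above.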

> What especially puzzles me is the "master never locks others" item. All 
> first four, namely "shared disk failover", "file system replication", 
> "warm standby" and "master slave replication", block others (the slaves) 
> completely, which is about the worst kind of lock.

That item assumes you have slaves that are trying to do work.  The point
is that multi-master slows down the other slaves in a way no other
option does, which is the reason we don't support it yet.  I have
updated the wording to "No inter-server locking delay".

> Comparing between "File System Replication" and "Shared Disk Failover", 
> you state that the former has "master server overhead", while the latter 
> doesn't. Seen solely from the single server node, this might be true. 
> But summarized over the cluster, you have a network with a quite similar 
> load in both cases. I wouldn't say one has less overhead than the other 
> per definition.

The point is that file system replication has to wait for the standby
server to write the blocks, while disk failover does not.  I don't think
the network is an issue considering many use NAS anyway.

> Then, you are mixing apples and oranges. Why should a "statement based 
> replication solution" not require conflict resolution? You can build 
> eager as well as lazy statement based replication solutions; the two 
> properties have nothing to do with each other, do they?

There is no dot there so I am saying "statement based replication
solution" requires conflict resolution.  Agreed you could do it without
conflict resolution and it is kind of independent.  How should we deal
with this?

> Same applies to "master slave replication" and "per table granularity".

I tried to mark them based on existing or typical solutions, but you are
right, especially if the master/slave is not PITR based.  Some can't do
per-table, like disk failover.

> And in the special case of (async, but eager) Postgres-R also to "async 
> multi-master replication" and "no conflict resolution necessary". 
> Although I can understand that that's a pretty nifty difference.

Yea, the table isn't going to be 100% accurate, but it tries to summarize
what is in the section above.

> Given the matrix focuses on practically available solutions, I can see 
> some value in it. But from a more theoretical viewpoint, I find it 
> pretty confusing. Now, if you want a practically usable feature 
> comparison table, I'd strongly vote for clearly mentioning the products 
> you have in mind - otherwise the table pretends to be something it is not.

I considered that, and I can add something that says you have to consult
the text above for more details.  Some entries require mentioning a
specific solution, like Slony, while others do not, like disk failover.

> If it should be theoretically correct without mentioning available 
> solutions, I'd rather vote for explaining the terms and concepts.
> 
> To clarify my viewpoint, I'll quickly go over the features you're 
> mentioning and associate them with the concepts, as I understand them.
> 
>   - special hardware:  always nice, not much theoretical effect, a
>     network is a network, storage is storage.
> 
>   - multiple masters:  that's what single- vs multi masters is about:
>     writing transactions. Can be mixed with
>     eager/lazy, every combination makes
>     sense for certain applications.
> 
>   - overhead:  replication per definition generates overhead,
>     question is: how much, and where.
> 
>   - locking of others: again, question of how much and how fine grained
>     the locking is. In a single master repl. sol., the
>     slaves are locked completely. In lazy repl. sol.,
>     the locking is deferred until after the commit,
>     during conflict resolution. In eager repl. sol.,
>     the locking needs to take place before the commit

[DOCS] Placement of contrib modules in SGML documentation

2007-11-10 Thread Tom Lane
I am still desperately unhappy with the choice to put the contrib docs
where they were put.  They are by no stretch of the imagination part of
the "SQL Language", and there is no defense for having inserted them
into the middle of the part, in front of substantially more widely
interesting information such as concurrency control.

Furthermore, labeling them "Standard Modules" is somebody's flight of
wishful thinking --- if they were installed by default, they'd deserve
such a title, but that's not happening any time soon.

I think there's a case for putting these pages under Part V Server
Programming (though a few are not in fact server-side code), or under
Part VI Reference (ignoring the fact that most of the text isn't in a
uniform reference-page style ... though maybe we could plan to work
towards that) or under Appendixes (though I'm sure there are people
who will complain about that because their private agenda is to make
these things as prominent as possible).  Or we could make them a new
top-level Part, probably just after Reference.

As for the title, how about "Available Add-On Modules", or something
like that?

BTW, why are neither contrib/dblink nor contrib/spi included in the
conversion?

regards, tom lane



Re: [DOCS] Placement of contrib modules in SGML documentation

2007-11-10 Thread Bruce Momjian
Tom Lane wrote:
> I am still desperately unhappy with the choice to put the contrib docs
> where they were put.  They are by no stretch of the imagination part of
> the "SQL Language", and there is no defense for having inserted them
> into the middle of the part, in front of substantially more widely
> interesting information such as concurrency control.

I think we need to decide where they will go;  they are easy to move.

> Furthermore, labeling them "Standard Modules" is somebody's flight of
> wishful thinking --- if they were installed by default, they'd deserve
> such a title, but that's not happening any time soon.

That name needs adjustment too.

> I think there's a case for putting these pages under Part V Server
> Programming (though a few are not in fact server-side code), or under
> Part VI Reference (ignoring the fact that most of the text isn't in a
> uniform reference-page style ... though maybe we could plan to work
> towards that) or under Appendixes (though I'm sure there are people
> who will complain about that because their private agenda is to make
> these things as prominent as possible).  Or we could make them a new
> top-level Part, probably just after Reference.

I think appendix is the right place myself.

> As for the title, how about "Available Add-On Modules", or something
> like that?

Yea, that is better.  Someone didn't want "contrib" mentioned in the
title.  The problem with "Available" is that it doesn't include
pgfoundry stuff which is _available_ too, just not shipped.

> BTW, why are neither contrib/dblink nor contrib/spi included in the
> conversion?

I see dblink:

http://momjian.us/main/writings/pgsql/sgml/dblink.html

I assume spi wasn't done because it is just examples of SPI usage.

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB   http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [DOCS] Placement of contrib modules in SGML documentation

2007-11-10 Thread Tom Lane
Bruce Momjian <[EMAIL PROTECTED]> writes:
>> BTW, why are neither contrib/dblink nor contrib/spi included in the
>> conversion?

> I see dblink:
>   http://momjian.us/main/writings/pgsql/sgml/dblink.html

Oh, in that case the question is why the contrib/db/doc/ files are
still there.

> I assume spi wasn't done because it is just examples of SPI usage.

The .example files are fine, but what about the two README files?

regards, tom lane



Re: [DOCS] Placement of contrib modules in SGML documentation

2007-11-10 Thread Bruce Momjian
Tom Lane wrote:
> Bruce Momjian <[EMAIL PROTECTED]> writes:
> >> BTW, why are neither contrib/dblink nor contrib/spi included in the
> >> conversion?
> 
> > I see dblink:
> > http://momjian.us/main/writings/pgsql/sgml/dblink.html
> 
> Oh, in that case the question is why the contrib/db/doc/ files are
> still there.

Because I thought only READMEs were converted.  Removed now.

> > I assume spi wasn't done because it is just examples of SPI usage.
> 
> The .example files are fine, but what about the two README files?

My guess is they were not converted because they are READMEs that
relate to the examples.  We can convert them if people want.

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB   http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
