Re: [DOCS] High Availability, Load Balancing, and Replication Feature Matrix
Hello Bruce,

Bruce Momjian wrote:
> I have added a High Availability, Load Balancing, and Replication
> Feature Matrix table to the docs:

Nice work. I appreciate your efforts in clearing up the uncertainty that surrounds this topic.

As you might have guessed, I have some complaints regarding the Feature Matrix. I hope this won't discourage you, but I'd rather like to contribute to an improved variant.

First of all, I don't quite like the negated formulations. I can see that you want a dot to mark a positive feature, but I find it hard to understand. What I'm especially puzzled about is the "master never locks others". All of the first four, namely "shared disk failover", "file system replication", "warm standby" and "master slave replication", block others (the slaves) completely, which is about the worst kind of lock.

Comparing "File System Replication" and "Shared Disk Failover", you state that the former has "master server overhead", while the latter doesn't. Seen solely from the single server node, this might be true. But summed over the cluster, you have a network with quite similar load in both cases. I wouldn't say one has less overhead than the other per definition.

Then, you are mixing apples and oranges. Why should a "statement based replication solution" not require conflict resolution? You can build eager as well as lazy statement based replication solutions; the one does not have anything to do with the other, does it? The same applies to "master slave replication" and "per table granularity". And in the special case of (async, but eager) Postgres-R, also to "async multi-master replication" and "no conflict resolution necessary". Although I can understand that that's a pretty nifty difference.

Given that the matrix focuses on practically available solutions, I can see some value in it. But from a more theoretical viewpoint, I find it pretty confusing.
Now, if you want a practically usable feature comparison table, I'd strongly vote for clearly mentioning the products you have in mind - otherwise the table pretends to be something it is not. If it should be theoretically correct without mentioning available solutions, I'd rather vote for explaining the terms and concepts.

To clarify my viewpoint, I'll quickly go over the features you're mentioning and associate them with the concepts, as I understand them.

- special hardware: always nice, not much theoretical effect; a network is a network, storage is storage.

- multiple masters: that's what single- vs. multi-master is about: writing transactions. Can be mixed with eager/lazy; every combination makes sense for certain applications.

- overhead: replication per definition generates overhead; the question is: how much, and where.

- locking of others: again, a question of how much and how fine grained the locking is. In a single master repl. sol., the slaves are locked completely. In a lazy repl. sol., the locking is deferred until after the commit, during conflict resolution. In an eager repl. sol., the locking needs to take place before the commit. But all replication systems need some kind of locks!

- data loss on fail: solely dependent on eager/lazy. (Given a real replication, with a replica, which shared storage does not provide, IMO.)

- slaves read only: theoretically possible with all replication systems, be they lazy/eager, single-/multi-master. That we are unable to read from slave nodes is an implementation annoyance of Postgres, if you want.

- per table gran.: again, independent of lazy/eager, single-/multi-master. Depends solely on the level where data is replicated: block device, file system, statement, WAL or other internal format.

- conflict resol.: in multi-master systems, that depends on the lazy/eager property. Single master systems obviously never need to resolve conflicts.

IMO, "data partitioning" is entirely perpendicular to replication.
It can be combined, in various ways. There's horizontal and vertical partitioning, eager/lazy and single-/multi-master replication. I guess we could find a use case for most of the combinations thereof. (Kudos for finding a combination which definitely has no use case.)

Well, these are my theories; do with them whatever you like. Comments appreciated.

Kind regards

Markus

---(end of broadcast)---
TIP 4: Have you searched our list archives? http://archives.postgresql.org
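The two axes laid out above (eager/lazy propagation, single- vs. multi-master writes) can be sketched as a tiny classification. This is purely illustrative: the solution names and their eager/multi-master assignments below are my assumptions drawn from this thread, not the actual feature matrix.

```python
# Illustrative sketch of the replication axes discussed in this thread.
# The feature assignments here are assumptions, not authoritative.
from dataclasses import dataclass

@dataclass(frozen=True)
class Solution:
    name: str
    eager: bool         # changes reach a replica before commit returns
    multi_master: bool  # more than one node accepts writes

SOLUTIONS = [
    Solution("Shared Disk Failover",       eager=True,  multi_master=False),
    Solution("File System Replication",    eager=True,  multi_master=False),
    Solution("Warm Standby (PITR)",        eager=False, multi_master=False),
    Solution("Statement-Based Middleware", eager=True,  multi_master=True),
    Solution("Async Multi-Master",         eager=False, multi_master=True),
]

def needs_conflict_resolution(s: Solution) -> bool:
    # Single-master systems never need conflict resolution; among
    # multi-master systems, only lazy ones do (conflicts are detected
    # after commit rather than prevented by locking before it).
    return s.multi_master and not s.eager

def may_lose_data_on_failover(s: Solution) -> bool:
    # Lazy propagation means a committed transaction may not yet have
    # reached any replica when the master fails.
    return not s.eager

for s in SOLUTIONS:
    print(f"{s.name}: conflicts={needs_conflict_resolution(s)}, "
          f"data loss risk={may_lose_data_on_failover(s)}")
```

Note this sketch deliberately treats the two axes as independent, which is the point being argued: "conflict resolution" and "data loss on fail" fall out of the axes, rather than being free-standing features of particular products.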
Re: [DOCS] High Availability, Load Balancing, and Replication Feature Matrix
Markus Schiltknecht wrote:
> Hello Bruce,
>
> Bruce Momjian wrote:
> > I have added a High Availability, Load Balancing, and Replication
> > Feature Matrix table to the docs:
>
> Nice work. I appreciate your efforts in clearing up the uncertainty that
> surrounds this topic.
>
> As you might have guessed, I have some complaints regarding the Feature
> Matrix. I hope this won't discourage you, but I'd rather like to
> contribute to an improved variant.

Not sure if you were around when we wrote this chapter, but there was a lot of good discussion to get it to where it is now.

> First of all, I don't quite like the negated formulations. I can see
> that you want a dot to mark a positive feature, but I find it hard to
> understand.

Well, the idea is to say "what things do I want, and what offers it?" If you have positive/negative it makes it harder to do that. I realize it is confusing in a different way. We could split out the negatives into a different table, but that seems worse.

> I'm especially puzzled about the "master never locks others". All of the
> first four, namely "shared disk failover", "file system replication",
> "warm standby" and "master slave replication", block others (the slaves)
> completely, which is about the worst kind of lock.

That item assumes you have slaves that are trying to do work. The point is that multi-master slows down the other slaves in a way no other option does, which is the reason we don't support it yet. I have updated the wording to "No inter-server locking delay".

> Comparing "File System Replication" and "Shared Disk Failover", you
> state that the former has "master server overhead", while the latter
> doesn't. Seen solely from the single server node, this might be true.
> But summed over the cluster, you have a network with quite similar
> load in both cases. I wouldn't say one has less overhead than the other
> per definition.
The point is that file system replication has to wait for the standby server to write the blocks, while shared disk failover does not. I don't think the network is an issue, considering many use NAS anyway.

> Then, you are mixing apples and oranges. Why should a "statement based
> replication solution" not require conflict resolution? You can build
> eager as well as lazy statement based replication solutions; the one
> does not have anything to do with the other, does it?

There is no dot there, so I am saying a "statement based replication solution" requires conflict resolution. Agreed, you could do it without conflict resolution, and it is kind of independent. How should we deal with this?

> The same applies to "master slave replication" and "per table granularity".

I tried to mark them based on existing or typical solutions, but you are right, especially if the master/slave is not PITR based. Some can't do per-table, like shared disk failover.

> And in the special case of (async, but eager) Postgres-R, also to "async
> multi-master replication" and "no conflict resolution necessary".
> Although I can understand that that's a pretty nifty difference.

Yea, the table isn't going to be 100%, but it tries to summarize what is in the section above.

> Given that the matrix focuses on practically available solutions, I can
> see some value in it. But from a more theoretical viewpoint, I find it
> pretty confusing. Now, if you want a practically usable feature
> comparison table, I'd strongly vote for clearly mentioning the products
> you have in mind - otherwise the table pretends to be something it is not.

I considered that, and I can add something that says you have to consider the text above for more details. Some require solution mentions, like Slony, while others do not, like shared disk failover.

> If it should be theoretically correct without mentioning available
> solutions, I'd rather vote for explaining the terms and concepts.
> To clarify my viewpoint, I'll quickly go over the features you're
> mentioning and associate them with the concepts, as I understand them.
>
> - special hardware: always nice, not much theoretical effect; a
>   network is a network, storage is storage.
>
> - multiple masters: that's what single- vs. multi-master is about:
>   writing transactions. Can be mixed with eager/lazy; every
>   combination makes sense for certain applications.
>
> - overhead: replication per definition generates overhead; the
>   question is: how much, and where.
>
> - locking of others: again, a question of how much and how fine grained
>   the locking is. In a single master repl. sol., the slaves are locked
>   completely. In a lazy repl. sol., the locking is deferred until after
>   the commit, during conflict resolution. In an eager repl. sol., the
>   locking needs to take place before the commit
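The "master server overhead" point above (the master waiting for the standby to write the blocks, versus not waiting) boils down to where the commit blocks. The following is a minimal sketch of that distinction, not PostgreSQL code; the class and function names are my own invention for illustration.

```python
# Minimal sketch of eager vs. lazy propagation: in the eager path the
# master's commit blocks until the standby acknowledges the write; in
# the lazy path the write is merely queued and commit returns at once.
import queue
import threading

class Standby:
    """Toy standby: applies blocks from a queue on a background thread."""

    def __init__(self):
        self.applied = []
        self._q = queue.Queue()
        threading.Thread(target=self._apply_loop, daemon=True).start()

    def _apply_loop(self):
        while True:
            block, done = self._q.get()
            self.applied.append(block)  # pretend to write the block to disk
            done.set()                  # acknowledge the write

    def write(self, block, wait):
        done = threading.Event()
        self._q.put((block, done))
        if wait:        # eager: block the caller until the standby applied it
            done.wait()

def commit(standby, block, eager):
    # Eager commit carries the standby's write latency as master overhead;
    # lazy commit returns immediately, at the cost of a data-loss window.
    standby.write(block, wait=eager)
    return "committed"
```

With `eager=True` the block is guaranteed to be on the standby by the time `commit` returns; with `eager=False` there is a window during which a master crash would lose the committed change, which is the lazy/eager trade-off discussed in the first message of this thread.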
[DOCS] Placement of contrib modules in SGML documentation
I am still desperately unhappy with the choice to put the contrib docs where they were put. They are by no stretch of the imagination part of the "SQL Language", and there is no defense for having inserted them into the middle of that part, in front of substantially more widely interesting information such as concurrency control.

Furthermore, labeling them "Standard Modules" is somebody's flight of wishful thinking --- if they were installed by default, they'd deserve such a title, but that's not happening any time soon.

I think there's a case for putting these pages under Part V Server Programming (though a few are not in fact server-side code), or under Part VI Reference (ignoring the fact that most of the text isn't in a uniform reference-page style ... though maybe we could plan to work towards that), or under Appendixes (though I'm sure there are people who will complain about that, because their private agenda is to make these things as prominent as possible). Or we could make them a new top-level Part, probably just after Reference.

As for the title, how about "Available Add-On Modules", or something like that?

BTW, why are neither contrib/dblink nor contrib/spi included in the conversion?

regards, tom lane
Re: [DOCS] Placement of contrib modules in SGML documentation
Tom Lane wrote:
> I am still desperately unhappy with the choice to put the contrib docs
> where they were put. They are by no stretch of the imagination part of
> the "SQL Language", and there is no defense for having inserted them
> into the middle of the part, in front of substantially more widely
> interesting information such as concurrency control.

I think we need to decide where they will go; they are easy to move.

> Furthermore, labeling them "Standard Modules" is somebody's flight of
> wishful thinking --- if they were installed by default, they'd deserve
> such a title, but that's not happening any time soon.

That name needs adjustment too.

> I think there's a case for putting these pages under Part V Server
> Programming (though a few are not in fact server-side code), or under
> Part VI Reference (ignoring the fact that most of the text isn't in a
> uniform reference-page style ... though maybe we could plan to work
> towards that) or under Appendixes (though I'm sure there are people
> who will complain about that because their private agenda is to make
> these things as prominent as possible). Or we could make them a new
> top-level Part, probably just after Reference.

I think an appendix is the right place myself.

> As for the title, how about "Available Add-On Modules", or something
> like that?

Yea, that is better. Someone didn't want "contrib" mentioned in the title. The problem with "Available" is that it doesn't include pgfoundry stuff, which is _available_ too, just not shipped.

> BTW, why are neither contrib/dblink nor contrib/spi included in the
> conversion?

I see dblink: http://momjian.us/main/writings/pgsql/sgml/dblink.html

I assume spi wasn't done because it is just examples of SPI usage.

-- 
Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us
EnterpriseDB http://postgres.enterprisedb.com
+ If your life is a hard drive, Christ can be your backup.
Re: [DOCS] Placement of contrib modules in SGML documentation
Bruce Momjian <[EMAIL PROTECTED]> writes:
> > BTW, why are neither contrib/dblink nor contrib/spi included in the
> > conversion?
>
> I see dblink:
> http://momjian.us/main/writings/pgsql/sgml/dblink.html

Oh, in that case the question is why the contrib/db/doc/ files are still there.

> I assume spi wasn't done because it is just examples of SPI usage.

The .example files are fine, but what about the two README files?

regards, tom lane
Re: [DOCS] Placement of contrib modules in SGML documentation
Tom Lane wrote:
> Bruce Momjian <[EMAIL PROTECTED]> writes:
> > > BTW, why are neither contrib/dblink nor contrib/spi included in the
> > > conversion?
>
> > I see dblink:
> > http://momjian.us/main/writings/pgsql/sgml/dblink.html
>
> Oh, in that case the question is why the contrib/db/doc/ files are
> still there.

Because I thought only READMEs were converted. Removed now.

> > I assume spi wasn't done because it is just examples of SPI usage.
>
> The .example files are fine, but what about the two README files?

My guess is they were not converted because they are READMEs that relate to the examples. We can convert them if people want.
