Re: [DOCS] Placement of contrib modules in SGML documentation

2007-11-11 Thread Albert Cervera i Areny
On Sunday 11 November 2007, Tom Lane wrote:
> I think there's a case for putting these pages under Part V Server
> Programming (though a few are not in fact server-side code), or under
> Part VI Reference (ignoring the fact that most of the text isn't in a
> uniform reference-page style ... though maybe we could plan to work
> towards that) or under Appendixes (though I'm sure there are people
> who will complain about that because their private agenda is to make
> these things as prominent as possible).  Or we could make them a new
> top-level Part, probably just after Reference.
>

That's where I put them initially, AFAIR, but somebody complained they should be
in the Reference. Maybe if we now agree that's not the appropriate place we
can move them to a new Part.

-- 
Albert Cervera i Areny
http://www.NaN-tic.com

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [DOCS] High Availability, Load Balancing, and Replication Feature Matrix

2007-11-11 Thread Markus Schiltknecht

Hello Bruce,

thank you for your detailed answer.

Bruce Momjian wrote:

> Not sure if you were around when we wrote this chapter but there was a
> lot of good discussion to get it to where it is now.


Uh.. IIRC quite a good part of the discussion for chapter 23 was between 
you and me, pretty exactly a year ago. Or what discussion are you 
referring to?


>> First of all, I don't quite like the negated formulations. I can see
>> that you want a dot to mark a positive feature, but I find it hard to
>> understand.


> Well, the idea is to say "what things do I want and what offers it?"  If
> you have positive/negative it makes it harder to do that.  I realize it
> is confusing in a different way.  We could split out the negatives into
> a different table but that seems worse.


Hm.. yeah, I can understand that. As those are things the user wants, I
think we could formulate positive wishes. Just a proposal:


No special hardware required:         works with commodity hardware
No conflict resolution necessary:     maintains durability property
master failure will never lose data:  maintains durability on single node failure

With the other two I'm unsure.. I see it's very hard to find helpful 
positive formulations...


>> What I'm especially puzzled about is the "master never locks others". All
>> of the first four, namely "shared disk failover", "file system replication",
>> "warm standby" and "master slave replication", block others (the slaves)
>> completely, which is about the worst kind of lock.


> That item assumes you have slaves that are trying to do work.


Yes, replication in general assumes that. So does high availability, 
IMO. Having read-only slaves means nothing else but locking them from 
write access.



> The point
> is that multi-master slows down the other slaves in a way no other
> option does,


Uh.. you mean the other masters? But according to that statement, "async
multi-master replication" as well as "statement-based replication
middleware" should not have a dot, because those slow down other
masters as well. In the async case at different points in time, yes, but all
masters have to write the data, which slows them down.


I'm suspecting you are rather talking about the network dependent commit 
latency of eager replication solutions. I find the term "locking delay" 
for that rather confusing. How about: "normal commit latency"? (Normal, 
as in: depends on the storage system used, instead of on the network and 
storage).
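
To make the proposed wording concrete, here is a toy sketch of the distinction (not a measurement of any real system; the function name, parameters and numbers are all made up for illustration). It only assumes that an eager solution adds a network wait on top of the local storage wait at commit time:

```python
def commit_latency_ms(storage_ms, network_rtt_ms=0.0, eager=False):
    """Toy model of how long a commit waits before it is acknowledged.

    All names and numbers are hypothetical, for illustration only.
    """
    # "Normal" commit latency: only the local storage system is involved.
    latency = storage_ms
    # An eager replication solution also waits on the network before
    # the commit returns.
    if eager:
        latency += network_rtt_ms
    return latency


normal = commit_latency_ms(storage_ms=5.0)
eager = commit_latency_ms(storage_ms=5.0, network_rtt_ms=2.0, eager=True)
print(normal, eager)
```

In this model a solution would get the dot only if its commit latency equals the storage-only figure.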



> which is the reason we don't support it yet.


Uhm.. PgCluster *is* a synchronous multi-master replication solution. It
is also a middleware and it does statement-based replication. Which dots
of the matrix do you think apply to it?


>> Comparing "File System Replication" and "Shared Disk Failover",
>> you state that the former has "master server overhead", while the latter
>> doesn't. Seen solely from the single server node, this might be true.
>> But summarized over the cluster, you have a network with a quite similar
>> load in both cases. I wouldn't say one has less overhead than the other
>> per definition.


> The point is that file system replication has to wait for the standby
> server to write the blocks, while disk failover does not.


In "disk failover", the master has to wait for the NAS to write the 
blocks on mirrored disks, while in "file system replication" the master 
has to wait for multiple nodes to write the blocks. As the nodes of a 
replicated file system can write in parallel, very much like a RAID-1 
NAS, I don't see that much of a difference there.
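
As a rough sketch of that argument (a toy model with invented numbers, not a benchmark; the function names are hypothetical): if every copy is written in parallel, the master waits for the slowest copy, not for the sum of all copies, regardless of whether the copies are disks behind a NAS or nodes of a replicated file system.

```python
def parallel_wait_ms(copies):
    """Wait time when all copies are written in parallel.

    `copies` is a list of (transfer_ms, write_ms) pairs, one per mirror
    or node. Hypothetical model: the master waits for the slowest copy.
    """
    return max(transfer + write for transfer, write in copies)


def sequential_wait_ms(copies):
    """For contrast: wait time if the copies were written one after another."""
    return sum(transfer + write for transfer, write in copies)


# Same made-up figures for a two-way mirror in both setups.
two_copies = [(0.5, 4.0), (0.5, 4.5)]
print(parallel_wait_ms(two_copies), sequential_wait_ms(two_copies))
```

With equal hardware on both sides the parallel figure is the same in either setup, which is the point being made above.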



> I don't think
> the network is an issue considering many use NAS anyway.


I think you are comparing an enterprise NAS to a low-cost, commodity 
hardware clustered filesystem. Take the same amount of money and the 
same number of mirrors and you'll get comparable performance.



> There is no dot there so I am saying "statement based replication
> solution" requires conflict resolution.  Agreed you could do it without
> conflict resolution and it is kind of independent.  How should we deal
> with this?


Maybe a third state: 'n/a'?
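
A minimal sketch of what a three-state cell could look like (purely illustrative; the class, dictionary and rendering symbols are my own invention, and the single entry shown is only the case discussed here, not the real table):

```python
from enum import Enum


class Cell(Enum):
    """Possible states of one cell in the feature matrix (hypothetical)."""
    YES = "dot"
    NO = "no dot"
    NOT_APPLICABLE = "n/a"


# Only the one cell under discussion; every other cell is left out on purpose.
feature_matrix = {
    ("Statement-Based Replication Middleware",
     "No conflict resolution necessary"): Cell.NOT_APPLICABLE,
}


def render(solution, feature):
    """Return the text to print in the table for the given cell."""
    symbols = {Cell.YES: "*", Cell.NO: "", Cell.NOT_APPLICABLE: "n/a"}
    return symbols[feature_matrix[(solution, feature)]]


print(render("Statement-Based Replication Middleware",
             "No conflict resolution necessary"))
```

The third state keeps "does not require it" and "the question does not apply" from collapsing into the same empty cell.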

And in the special case of (async, but eager) Postgres-R also to "async 
multi-master replication" and "no conflict resolution necessary". 
Although I can understand that that's a pretty nifty difference.


> Yea, the table isn't going to be 100% but tries to summarize what is in the
> section above.


That's fine.

> [...]
>

> Right, but the point of the chart is to give people guidance, not to
> give them details;  that is in the part above.


Well, sure. But then we are back at the discussion of the parts above, 
which is quite fuzzy, IMO. I'm still missing those details. And I'm 
dubious about it being a basis for a feature matrix with clear dots or 
no dots. For the reasons explained above.


IMO, "data partitioning" is entirely perpendicular to replication. It 
can be combined, in various ways. There's horizontal and vertical 
partitioning, eager/lazy and single-/multi-master replication.

Re: [DOCS] [PATCHES] Contrib docs v1

2007-11-11 Thread Bruce Momjian
Albert Cervera i Areny wrote:
> On Sunday 11 November 2007, Bruce Momjian wrote:
> > Albert Cervera i Areny wrote:
> > > Sorry, I missed them, indeed I packed btree_gist instead of the good one
> > > btree-gist. The cube and xml2 have been added too.
> >
> > Thanks.  All applied.  I know people liked the README files in each
> > /contrib directory but we have no chance of keeping them in sync with
> > the SGML so I removed them.
> >
> > I still have lots of adjustments to make but at least it is in.
> 
> I know there are many things to improve but as you say at least it is in. Now 
> we can improve it incrementally.
> 
> >
> > Albert, can you do the new dict_int and dict_xsyn READMEs.  I could do
> > them but I am afraid I would not do as consistent of a job as you did.
> > Thanks.  Those README's are still in CVS /contrib, of course.
> 
> Of course. I'll send them ASAP.

Uh, don't start converting them yet.  I now realize that
/contrib/dict_int is an example and just like the stuff in /contrib/spi
perhaps shouldn't have its docs moved to SGML.  Is /contrib/dict_xsyn
also just an example?  Should we leave example /contrib modules
documented in READMEs?

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>        http://momjian.us
  EnterpriseDB                             http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [DOCS] High Availability, Load Balancing, and Replication Feature Matrix

2007-11-11 Thread Bruce Momjian
Markus Schiltknecht wrote:
> Hello Bruce,
> 
> thank you for your detailed answer.
> 
> Bruce Momjian wrote:
> > Not sure if you were around when we wrote this chapter but there was a
> > lot of good discussion to get it to where it is now.
> 
> Uh.. IIRC quite a good part of the discussion for chapter 23 was between 
> you and me, pretty exactly a year ago. Or what discussion are you 
> referring to?

Sorry, I forgot who was involved in that discussion.

> >> First of all, I don't quite like the negated formulations. I can see 
> >> that you want a dot to mark a positive feature, but I find it hard to 
> >> understand.
> > 
> > Well, the idea is to say "what things do I want and what offers it?"  If
> > you have positive/negative it makes it harder to do that.  I realize it
> > is confusing in a different way.  We could split out the negatives into
> > a different table but that seems worse.
> 
> Hm.. yeah, I can understand that. As those are things the user wants, I
> think we could formulate positive wishes. Just a proposal:
> 
> No special hardware required:works with commodity hardware
> 
> No conflict resolution necessary:maintains durability property
> 
> master failure will never lose data: maintains durability
>   on single node failure
> 
> With the other two I'm unsure.. I see it's very hard to find helpful 
> positive formulations...

Yea, that's where I got stuck --- that the positives were harder to
understand.

> >> I'm especially puzzled about is the "master never locks others". All 
> >> first four, namely "shared disk failover", "file system replication", 
> >> "warm standby" and "master slave replication", block others (the slaves) 
> >> completely, which is about the worst kind of lock.
> > 
> > That item assumes you have slaves that are trying to do work.
> 
> Yes, replication in general assumes that. So does high availability, 
> IMO. Having read-only slaves means nothing else but locking them from 
> write access.
>
> > The point
> > is that multi-master slows down the other slaves in a way no other
> > option does,
> 
> Uh.. you mean the other masters? But according to that statement, "async 

Sorry, I meant that a master that is modifying data is slowed down by
other masters to an extent that doesn't happen in other cases (e.g. with
slaves).  Is the current "No inter-server locking delay" OK?

> multi-master replication" as well as "statement-based replication 
> middleware" should not have a dot, because those as well slow down other 
> masters. In the async case at different points in time, yes, but all 
> masters have to write the data, which slows them down.

Yea, that is why I have the new text about locking.

> I'm suspecting you are rather talking about the network dependent commit 
> latency of eager replication solutions. I find the term "locking delay" 
> for that rather confusing. How about: "normal commit latency"? (Normal, 
> as in: depends on the storage system used, instead of on the network and 
> storage).

Uh, I assume that multi-master locking happens often before the commit.

> > which is the reason we don't support it yet.
> 
> Uhm.. PgCluster *is* a synchronous multi-master replication solution. It 
> also is a middleware and it does statement based replication. Which dots 
> of the matrix do you think apply for it?

I don't consider PgCluster middleware because the servers have to
cooperate with the middleware.  And I am told it is much slower for
writes than a single server which supports my "locking" item, though it
is more "waiting for other masters" that is the delay, I think.

> >> Comparing between "File System Replication" and "Shared Disk Failover", 
> >> you state that the former has "master server overhead", while the latter
> >> doesn't. Seen solely from the single server node, this might be true. 
> >> But summarized over the cluster, you have a network with a quite similar 
> >> load in both cases. I wouldn't say one has less overhead than the other 
> >> per definition.
> > 
> > The point is that file system replication has to wait for the standby
> > server to write the blocks, while disk failover does not.
> 
> In "disk failover", the master has to wait for the NAS to write the 
> blocks on mirrored disks, while in "file system replication" the master 
> has to wait for multiple nodes to write the blocks. As the nodes of a 
> replicated file system can write in parallel, very much like a RAID-1 
> NAS, I don't see that much of a difference there.

I don't assume the disk failover has mirrored disks.  It can, just like a
single server can, but that isn't part of the backend process, and I
assume a RAID card that has RAM that can cache writes.  In the file
system replication case the server has to send commands to the
mirror and wait for completion.

> > I don't think
> > the network is an issue considering many use NAS anyway.
> 
> I think you are comparing an enterprise NAS to a low-cost, commo

Re: [DOCS] Placement of contrib modules in SGML documentation

2007-11-11 Thread Bruce Momjian
Bruce Momjian wrote:
> Tom Lane wrote:
> > I am still desperately unhappy with the choice to put the contrib docs
> > where they were put.  They are by no stretch of the imagination part of
> > the "SQL Language", and there is no defense for having inserted them
> > into the middle of the part, in front of substantially more widely
> > interesting information such as concurrency control.
> 
> I think we need to decide where they will go;  they are easy to move.
> 
> > Furthermore, labeling them "Standard Modules" is somebody's flight of
> > wishful thinking --- if they were installed by default, they'd deserve
> > such a title, but that's not happening any time soon.
> 
> That name needs adjustment too.

How about "Additional Supplied Modules"?



Re: [DOCS] [PATCHES] Contrib docs v1

2007-11-11 Thread Tom Lane
Bruce Momjian <[EMAIL PROTECTED]> writes:
> Uh, don't start converting them yet.  I now realize that
> /contrib/dict_int is an example and just like the stuff in /contrib/spi
> perhaps shouldn't have its docs moved to SGML.

What makes you realize any such thing?  You could make that argument for
test_parser, probably, but the dict modules are useful in their own
right.

> Should we leave example /contrib modules documented in READMEs?

What is the point of such a distinction?

regards, tom lane



Re: [DOCS] [PATCHES] Contrib docs v1

2007-11-11 Thread Bruce Momjian
Tom Lane wrote:
> Bruce Momjian <[EMAIL PROTECTED]> writes:
> > Uh, don't start converting them yet.  I now realize that
> > /contrib/dict_int is an example and just like the stuff in /contrib/spi
> > perhaps shouldn't have its docs moved to SGML.
> 
> What makes you realize any such thing?  You could make that argument for
> test_parser, probably, but the dict modules are useful in their own
> right.
> 
> > Should we leave example /contrib modules documented in READMEs?
> 
> What is the point of such a distinction?

If the contrib value is the source code itself then it seems a README is
more appropriate as people are not going to install the contrib module
itself --- they are using it only to learn.



Re: [DOCS] [PATCHES] Contrib docs v1

2007-11-11 Thread Tom Lane
Bruce Momjian <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> What is the point of such a distinction?

> If the contrib value is the source code itself then it seems a README is
> more appropriate as people are not going to install the contrib module
> itself --- they are using it only to learn.

I find this argument pretty unconvincing.  Most of the contrib modules
serve at least some purpose as coding examples.

If we set up a situation where all but one or two are documented in the
main SGML docs, the net effect will be that people don't even know that
the ones left out exist.  Even a module that's only useful as an example
won't be useful at all, if people don't find it.

regards, tom lane



Re: [DOCS] [PATCHES] Contrib docs v1

2007-11-11 Thread Bruce Momjian
Tom Lane wrote:
> Bruce Momjian <[EMAIL PROTECTED]> writes:
> > Tom Lane wrote:
> >> What is the point of such a distinction?
> 
> > If the contrib value is the source code itself then it seems a README is
> > more appropriate as people are not going to install the contrib module
> > itself --- they are using it only to learn.
> 
> I find this argument pretty unconvincing.  Most of the contrib modules
> serve at least some purpose as coding examples.
> 
> If we set up a situation where all but one or two are documented in the
> main SGML docs, the net effect will be that people don't even know that
> the ones left out exist.  Even a module that's only useful as an example
> won't be useful at all, if people don't find it.

Makes sense.  Albert, can you do the remaining READMEs?  You can skip
tsearch2 and the Japanese ones.  That is going away before final 8.3.
