Re: [DOCS] [HACKERS] Replication documentation addition

2006-11-06 Thread Chris Browne
[EMAIL PROTECTED] (Bruce Momjian) writes:
> With no new additions submitted today, I have moved my text into our
> SGML documentation:
>
>   http://momjian.us/main/writings/pgsql/sgml/failover.html
>
> Please let me know what additional changes are needed.

It's looking a lot improved to me...

There are still numerous places where it needs s/Slony/Slony-I/g
because there is more than one thing out there called "Slony," only
one of which is the single-master-to-multiple-subscribers-asynchronous
replication system...



"This can be complex to set up because functions like random() and
CURRENT_TIMESTAMP will have different values on different servers, and
sequences should be consistent across servers."

It doesn't make sense to call this "complex to set up."  This problem
isn't about complexity of setup; it is about whether updates are
processed identically on different hosts.  

Perhaps better:

"Query broadcasting can break down such that servers fall out of sync
if the queries have nondeterministic behavior.  For instance,
functions like random(), CURRENT_TIMESTAMP, and
nextval('some_sequence') will take on different values on different
servers.  Care must be taken at the application level to make sure
that queries are all fully deterministic and that they either COMMIT
or ABORT on all servers."


"24.6. Clustering For Load Balancing

In clustering, each server can accept write requests, and these write
requests are broadcast from the original server to all other servers
before each transaction commits. Under heavy load, this can cause
excessive locking and performance degradation. It is implemented by
Oracle in their RAC product. PostgreSQL does not offer this type of
load balancing, though PostgreSQL two-phase commit can be used to
implement this in application code or middleware."

Something doesn't feel entirely right here...

How about...

"24.6. Multimaster Replication For Load Balancing

In this scenario, each server can accept write requests, which are
broadcast from the original server to all other servers before each
transaction commits in order to ensure consistency.  Unfortunately,
under heavy load, the cost of distributing locks across servers can
lead to substantial performance degradation. It is implemented by
Oracle in their RAC product. PostgreSQL does not offer this type of
load balancing, though PostgreSQL two-phase commit using  and  may be used to implement this in
application code or middleware.

The communications costs involved in distributing locks and writes
have the result that write operations are considerably more expensive
than they would be on a single server.  In general, the cost of
distributed locking means that this clustering approach is only usable
across a cluster of servers at a local site.  

There will only be a performance "win" if the cluster mostly processes
read-only traffic that the cluster can distribute across a larger
number of database servers.  Write performance generally degrades a
fair bit as compared to using a single database server.  Reliability
should be enhanced since the cluster should be able to continue work
even if some of the members of the cluster should fail."



"24.7. Clustering For Parallel Query Execution

This allows multiple servers to work on a single query. One possible
way this could work is for the data to be split among servers and for
each server to execute its part of the query and results sent to a
central server to be combined and returned to the user. There
currently is no PostgreSQL open source solution for this."

This seems a bit thin.

"24.7. Clustering For Parallel Query Execution

This allows multiple servers to work concurrently on a single query,
analagous to the way RAID permits multiple disk drives to respond
concurrently to disk I/O requests.

One way this could work is for the data to be partitioned across the
servers, where each server executes its part of the query, submitting
results to a central server to be combined and returned to the user.
There currently is no PostgreSQL open source solution for this."
-- 
select 'cbbrowne' || '@' || 'acm.org';
http://cbbrowne.com/info/advocacy.html
Why do we put suits in a garment bag, and put garments in a suitcase? 

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [DOCS] [HACKERS] "recovering prepared transaction" after serverrestart

2006-11-06 Thread Richard Troy

On Fri, 3 Nov 2006, Tom Lane wrote:
>
> > Is there a way to see prepared transactions where the original session
> > that prepared then has died? Perhaps the message at startup should be
> > "you have at least one prepared transaction that needs resolution".
>
> I am completely baffled by this focus on database startup time.  That's
> not where the problem is.
>
>   regards, tom lane
>

I'm not alluding to anyone in particular, just responding to the focus on
startup time; When I joined Ingres as a Consultant (back when that was a
revered job), we saw this a lot, too, bubbling through the ranks from
technical support. Engineering was having a cow over it. We Consultants
were expected to backline such problems and be the interface between
engineering and the rest of the world. What we found was that in what we'd
call the ligitimate cases, the cause for concern over startup time had to
do with bugs that forced, one way or another, a server restart.

Illigitimate cases - the VAST majority - were the result of, well, let's
call them less-than-successful DBAs, thrashing their installations with
their management breathing down their necks, often with flailing arms and
fire coming out of their mouths saying things like, "I bet my business on
this!"... The usual causes there were inappropriate configurations, and a
critical cause of _that_ was an instalation toolset that didn't help
people size/position things properly. Often a sales guy or trainee would
configure a test system and then the customer would put that into
production without ever reexamining the settings.

I realized there was an opportunity here; I put together a training
program and we sold it as a service along with installation to new
customers to help them get off on the right foot. Once we did that, new
customers were essentially put on notice that they could either pay us to
help set them up, or they could do it, but that continuing along with what
the salesman or junior techie had done wasn't sufficient for a production
environment that you could bet your business on. ...The complaint and
concern about startup time dropped out of sight nearly immediately...

Opportunity here, for PostgreSql: A Technical Document of some kind
entitled something like: "How to move your testing environment into
production."

No, unfortunately, I can't volunteer to be the point person on this one.
And to the underlying question: is this the case with PostgreSql? I can't
say...

Regards,
Richard


-- 
Richard Troy, Chief Scientist
Science Tools Corporation
510-924-1363 or 202-747-1263
[EMAIL PROTECTED], http://ScienceTools.com/


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq