Re: [PERFORM] Scaling further up

2004-03-04 Thread Anjan Dave
Great response, Thanks.

Regarding 12GB memory and a 13GB db with almost no I/O: one thing I don't
understand is why the system swaps at all (not much, only during peak
times) when the OS caches most of the database in memory and PG can use
it as needed. SHMMAX is set to 512MB, shared_buffers is 150MB,
effective_cache_size is 2GB, sort_mem is 2MB, and the rest are default
values. It also happens that a large (reporting-type) query can hold up
the other queries, and the load averages shoot up during peak times.
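For reference, on PostgreSQL 7.x those settings would look roughly like the
fragment below in postgresql.conf (on that version shared_buffers and
effective_cache_size are counted in 8 kB pages and sort_mem in kB; the exact
numbers here are my illustrative conversions, not copied from the actual
config file):

```
# hypothetical postgresql.conf fragment matching the values described above
shared_buffers = 19200         # ~150 MB at 8 kB per buffer
effective_cache_size = 262144  # ~2 GB at 8 kB per page
sort_mem = 2048                # 2 MB, expressed in kB
```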

Regarding a baseline - 

-We document and monitor the frequency of SQL statements, the most
expensive ones, etc. (IronEye)
-I am monitoring disk reads/writes using iostat
-How do I measure commit frequency, and system events like checkpoint?
(vacuum is done nightly during less or no load)
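One way to measure commit frequency (my sketch, not something from the
thread): sample the cumulative xact_commit counter in pg_stat_database
twice and divide by the interval. The counter values below are made-up
stand-ins; in practice they would come from
SELECT xact_commit FROM pg_stat_database WHERE datname = 'yourdb'.

```python
# Estimate commits/sec from two samples of pg_stat_database.xact_commit.
# The sample values here are hypothetical stand-ins for real psql output.

def commits_per_second(count_t0, count_t1, interval_secs):
    """Rate of commits between two cumulative counter readings."""
    return (count_t1 - count_t0) / interval_secs

# e.g. two readings taken 60 seconds apart
rate = commits_per_second(1_000_000, 1_090_000, 60)
print(rate)  # 1500.0 commits/sec
```

Checkpoint activity can be inferred the same way by watching the WAL
directory or enabling checkpoint logging.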

Thanks,
Anjan


-Original Message-
From: Aaron W [mailto:[EMAIL PROTECTED] 
Sent: Thursday, March 04, 2004 8:58 AM
To: [EMAIL PROTECTED]; Anjan Dave
Subject: Re: Scaling further up



I'd look at adding more disks first. Depending on what type of query
load you get, that box sounds like it will be very much I/O bound.

Given a 13G database on a 12G system with a low growth rate, it is
likely that there is almost no I/O for most activities. The exception is
checkpointing.
The first thing I'd do is try to build a spreadsheet
model of:
- select frequency, and # logical and physical reads
involved
- insert/delete/update frequency, and # logical and
physical read and writes involved
- commit frequency, etc.
(start out with simplistic assumptions, and do it for
peak load)
- system events (checkpoints, vacuum)
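The spreadsheet model above can be sketched in a few lines of Python;
every frequency and per-operation read/write count here is a made-up
placeholder to show the arithmetic, not a measurement from the poster's
system:

```python
# Back-of-envelope peak-load I/O model: physical reads/writes per second.
# All workload numbers below are hypothetical placeholders.

workload = {
    # name: (ops/sec, physical reads per op, physical writes per op)
    "select": (2500, 0.1, 0.0),   # mostly cached -> few physical reads
    "update": (400, 0.2, 1.0),    # each update eventually writes a page
    "commit": (500, 0.0, 1.0),    # one WAL write per commit (simplified)
}

reads = sum(ops * r for ops, r, w in workload.values())
writes = sum(ops * w for ops, r, w in workload.values())
print(reads, writes)  # ~330 physical reads/sec, ~900 writes/sec
```

Comparing those totals against what one spindle (or the controller's
cache) can sustain shows quickly whether more disks would help.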

I assume that the only high I/O you will see will be for logging. The
RAID issue there is basically obviated by the sequential-write nature of
WAL. If that is the case, EMC is not the most cost-effective or obvious
solution, since the value they provide is mostly manageability for
disaster recovery. The goal in this case is to write at the
application's max speed and with minimal latency. Any decent
battery-backed, write-through (mirrored) cache controller can do that
for you.

On the other hand, if your requests are not *all* trivial, you are going
to test the hardware and the scheduling algorithms of the OS and pg.
Even if only 0.1% of 3,000 tps take a second, that ends up generating 3
seconds of load. Any even slightly slow transactions will generate
enormous queues which slow down everything.
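The load arithmetic in that paragraph, as a quick sanity check (3,000
tps with 0.1% taking a full second each):

```python
# If 0.1% of 3,000 tps take 1 second each, the slow transactions alone
# contribute about 3 seconds' worth of work arriving every second.
tps = 3000
slow_fraction = 0.001
slow_txn_time = 1.0  # seconds

slow_load = tps * slow_fraction * slow_txn_time
print(slow_load)  # ~3 seconds of work arriving per second
```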

In most systems of this volume I've seen, the mix of activities is
constantly invalidating cache, making L2 caching less important.
Memory-to-CPU bus speed is a limiting factor, as well as raw CPU speed
in processing the requests. Xeon is not a great architecture for this
because of FSB contention; I suspect a 4-way will be completely
FSB-bottlenecked, so going beyond 4 ways would likely not change
performance.


I would try to get a simple model/benchmark going and
test against it. You should be talking to the big iron
vendors for their take on your issues and get their
capacity benchmarks.


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] Scaling further up

2004-03-04 Thread scott.marlowe
On Thu, 4 Mar 2004, Paul Thomas wrote:

> On 03/03/2004 18:23 scott.marlowe wrote:
> > [snip]
> > There are three factors that affect how fast you can get to the next
> > sector:
> > 
> > seek time
> > settle time
> > rotational latency
> > 
> > Most drives only list the first, and don't bother to mention the other
> > two.
> 
> Ah yes, one of my (very) few still functioning brain cells was nagging 
> about another bit of time in the equation :)
> 
> > On many modern drives, the seek times are around 5 to 10 milliseconds.
> > [snip]
> 
> Going back to the OPs posting about random_page_cost, imagine I have 2 
> servers identical in every way except the disk drive. Server A has a 10K 
> rpm drive and server B has a 15K rpm drive. Seek/settle times aren't 
> spectacularly different between the 2 drives. I'm wondering if drive B 
> might actually merit a _higher_ random_page_cost than drive A as, once it 
> gets settled on a disk track, it can suck the data off a lot faster. 
> opinions/experiences anyone?

It might well be that you have higher settle times that offset the small 
gain in rotational latency.  I haven't looked into it, so I don't know one 
way or the other, but it seems a reasonable assumption.
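The rotational-latency difference being discussed is easy to put numbers
on: average rotational latency is half a revolution, so the 10k-to-15k
jump buys about a millisecond, which a worse settle time could indeed
offset.

```python
# Average rotational latency = half a revolution, in milliseconds.
def avg_rotational_latency_ms(rpm):
    return 0.5 * 60_000 / rpm

print(avg_rotational_latency_ms(10_000))  # 3.0 ms
print(avg_rotational_latency_ms(15_000))  # 2.0 ms
```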

However, a common misconception is that the higher angular velocity of
the 15krpm drives would allow you to read data faster.  In fact, the limit
of how fast you can read is set by the head.  There's a maximum frequency
that it can read, and the areal density / rpm have to be such that you
don't exceed that frequency.  Often, the speed at which you read off the
platters is exactly the same between a 10k and a 15k of the same family.

The required lower areal density is the reason 15krpm drives show up in 
the lower capacities first.

