Greg,

> 1) Collect up data about their system (memory, disk layout), find out a
> bit about their apps/workload, and generate a config file based on that.

We could start with this.  Where I bogged down is that collecting system 
information about several different operating systems ... and in some cases 
generating scripts for boosting things like shmmax ... is actually quite a 
large problem from a slog perspective; there is no standard way even within 
Linux to describe CPUs, for example. Collecting available disk space 
information is even worse.   So I'd like some help on this portion.

I actually have algorithms which are "good enough to start with" for most of 
the important GUCs worked out, and others could be set through an interactive 
script ("Does your application require large batch loads involving thousands 
or millions of updates in the same transaction?"  "How large (GB) do you 
expect your database to be?")

> 2) Connect to the database and look around.  Study the tables and some
> their stats, make some estimates based on what your find, produce a new
> config file.

I'm not sure that much more for (2) can be done than for (1).  Tables-on-disk 
don't tell us much.  

> 3) Monitor the database while it's doing its thing.  See which parts go
> well and which go badly by viewing database statistics like pg_statio.
> From that, figure out where the bottlenecks are likely to be and push more
> resources toward them.  What I've been working on lately is exposing more
> readouts of performance-related database internals to make this more
> practical.

We really should collaborate on that.  

> When first exposed to this problem, most people assume that (1) is good
> enough--ask some questions, look at the machine, and magically a
> reasonable starting configuration can be produced.  It's already been
> pointed out that anyone with enough knowledge to do all that can probably
> spit out a reasonable guess for the config file without help.  

But that's actually more than most people already do.  Further, if you don't 
start with a "reasonable" configuration, then it's difficult-impossible to 
analyze where your settings are out-of-whack; behavior introduced by some 
way-to-low settings will mask any other tuning that needs to be done.  It's 
also hard/impossible to devise tuning algorithms that work for both gross 
tuning (increase shared_buffers by 100x) and fine tuning (decrease 
bgwriter_interval to 45ms).

So whether or not we do (3), we need to do (1) first.

-- 
Josh Berkus
PostgreSQL @ Sun
San Francisco

---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

               http://www.postgresql.org/docs/faq

Reply via email to