Re: [PERFORM] PostgreSQL Configuration Tool for Dummies

2007-07-05 Thread Magnus Hagander
Tom Lane wrote:
 PFC [EMAIL PROTECTED] writes:
 What version of PostgreSQL are you using?
 
  I think newbies should be pushed a bit to use the latest versions,
 
 How about pushed *hard* ?  I'm constantly amazed at the number of people
 who show up in the lists saying they installed 7.3.2 or whatever random
 version they found in a dusty archive somewhere.  Please upgrade is at
 least one order of magnitude more valuable configuration advice than
 anything else we could tell them.

(picking up an old thread while at a boring wait at the airport.. anyway)

I keep trying to think of more nad better ways to do this :-) Perhaps we
should put some text on the bug reporting form (and in the documentation
about bug reporting) that's basically don't bother reporting a bug
unless you're on the latest in a branch, and at least make sure you're
on one of the maojr releases listed on www.postgresql.org?

Seems reasonable?

//Magnus



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] PostgreSQL Configuration Tool for Dummies

2007-07-05 Thread Harald Armin Massa

Magnus,

don't bother reporting a bug

unless you're on the latest in a branch, and at least make sure you're
on one of the maojr releases listed on www.postgresql.org?

Seems reasonable?



absolutely. Should be standard practice.

Harald

--
GHUM Harald Massa
persuadere et programmare
Harald Armin Massa
Spielberger Straße 49
70435 Stuttgart
0173/9409607
fx 01212-5-13695179
-
EuroPython 2007 will take place in Vilnius, Lithuania from Monday 9th July
to Wednesday 11th July. See you there!


Re: [PERFORM] PostgreSQL Configuration Tool for Dummies

2007-06-27 Thread Josh Berkus
Greg,

 I'd like to see people have a really simple set of questions to get them
 past the completely undersized initial configuration phase, then ship them
 toward resources to help educate about the parts that could be problematic
 for them based on what they do or don't know.  I don't see an
 inconsistancy that I'd expect people to have a reasonable guess for
 max_connections, while also telling them that setting sort_mem is
 important, a middle value has been assigned, but a really correct setting
 isn't something they can expect the simple config tool to figure out for
 them; here's a pointer to the appropriate documentation to learn more.

I disagree that this is acceptable, especially when we could set a better 
value using an easy-to-understand question. It's also been my experience (in 
3 years of professional performance tuning) that most users *don't* have an 
accurate guess for max_connections.

I'm really not clear on why you think what flavor of application do you 
have? is a difficult question.  It's certainly one that my clients were able 
to answer easily.  Overall, it seems like you're shooting for a conf tool 
which only really works for web apps, which isn't my personal goal or I think 
a good use of our time.

 I'm still of the opinion that recommendations for settings like
 max_fsm_pages and maintenance_work_mem should come out of a different type
 of tool that connects to the database.

Well, there's several steps to this:

1) Run conf tool when installing PG;
2) Run conf tool++ after application is first up and running;
3) Run conf tool++ after application has been in production

The (1) tool should at least provide a configuration which isn't going to lead 
to long term issues.  For example, dramatically underallocating fsm_pages can 
result in having to run VACUUM FULL and the associated downtime, so it's 
something we want to avoid at the outset.

-- 
Josh Berkus
PostgreSQL @ Sun
San Francisco

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] PostgreSQL Configuration Tool for Dummies

2007-06-26 Thread Josh Berkus
Greg,

 We've hashed through this area before, but for Lance's benefit I'll
 reiterate my dissenting position on this subject.  If you're building a
 tool for dummies, my opinion is that you shouldn't ask any of this
 information.  I think there's an enormous benefit to providing something
 that takes basic sizing information and gives conservative guidelines
 based on that--as you say, safe, middle-of-the-road values--that are
 still way, way more useful than the default values.  The risk in trying to
 make a complicated tool that satisfies all the users Josh is aiming his
 more sophisticated effort at is that you'll lose the newbies.

The problem is that there are no safe, middle-of-the-road values for some 
things, particularly max_connections and work_mem.  Particularly, there are 
very different conf profiles between reporting applications and OLTP/Web.  
We're talking about order-of-magnitude differences here, not just a few 
points.  e.g.:

Web app, typical machine:
max_connections = 200
work_mem = 256kb
default_statistics_target=100
autovacuum=on

Reporting app, same machine:
max_connections = 20
work_mem = 32mb
default_statistics_target=500
autovacuum=off

Possibly we could make the language of the application type selection less 
technical, but I don't see it as dispensible even for a basic tool.

 I wouldn't even bother asking how many CPUs somebody has for what Lance is
 building.  The kind of optimizations you'd do based on that are just too
 complicated to expect a tool to get them right and still be accessible to
 a novice.

CPUs affects the various cpu_cost parameters, but I can but the idea that this 
should only be part of the advanced tool.

-- 
Josh Berkus
PostgreSQL @ Sun
San Francisco

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] PostgreSQL Configuration Tool for Dummies

2007-06-26 Thread Greg Smith

On Tue, 26 Jun 2007, Josh Berkus wrote:


The problem is that there are no safe, middle-of-the-road values for some
things, particularly max_connections and work_mem.


Your max_connections concern is one fact that haunts the idea of just 
giving out some sample configs for people.  Lance's tool asks outright the 
expectation for max_connections which I think is the right thing to do.



Web app, typical machine:
work_mem = 256kb
default_statistics_target=100
autovacuum=on



Reporting app, same machine:
work_mem = 32mb
default_statistics_target=500
autovacuum=off


I think people are stuck with actually learning a bit about work_mem 
whether they like it or not, because it's important to make it larger but 
we know going too high will be a problem with lots of connections doing 
sorts.


As for turning autovacuum on/off and the stats target, I'd expect useful 
defaults for those would come out of how the current sample is asking 
about read vs. write workloads and expected database size.  Those simple 
to understand questions might capture enough of the difference between 
your two types here.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] PostgreSQL Configuration Tool for Dummies

2007-06-26 Thread Josh Berkus
Greg,

 Your max_connections concern is one fact that haunts the idea of just
 giving out some sample configs for people.  Lance's tool asks outright the
 expectation for max_connections which I think is the right thing to do.
...
 I think people are stuck with actually learning a bit about work_mem
 whether they like it or not, because it's important to make it larger but
 we know going too high will be a problem with lots of connections doing
 sorts.

I find it extremely inconsistent that you want to select middle-of-the-road 
defaults for some values and ask users detailed questions for other values. 
Which are we trying to do, here?

Given an application type selection, which is a question which can be 
written in easy-to-understand terms, these values can be set at reasonable 
defaults.  In fact, for most performance tuning clients I had, we never 
actually looped back and tested the defaults by monitoring pg_temp, memstat 
and the log; performance was acceptable with the approximate values.

 As for turning autovacuum on/off and the stats target, I'd expect useful
 defaults for those would come out of how the current sample is asking
 about read vs. write workloads and expected database size.  Those simple
 to understand questions might capture enough of the difference between
 your two types here.

Both of the questions you cite above are unlikely to result in accurate 
answers from users, and the read vs. write answer is actually quite useless 
except for the extreme cases (e.g. read-only or mostly-write).  The deciding 
answer in turning autovacuum off is whether or not the user does large bulk 
loads / ETL operations, which autovac would interfere with.

The fact that we can't expect an accurate answer on database size (except from 
the minority of users who already have a full production DB) will be a 
chronic stumbling block for any conf tool we build.  Quite a number of 
settings want to know this figure: max_fsm_pages, maintenance_work_mem, 
max_freeze_age, etc.  Question is, would order-of-magnitude answers be likely 
to have better results?  i.e.:

How large is your database expected to grow?
[] Less than 100MB / thousands of rows
[] 100mb to 1gb / tens to hundreds of thousands of rows
[] 1 gb to 10 gb / millions of rows
[] 10 gb to 100 gb / tens to hundreds of millions of rows
[] 100 gb to 1 TB / billions of rows
[] more than 1 TB / many billions of rows

... users might have better guesses within those rough ranges, and it would 
give us enough data to set rough values.

-- 
Josh Berkus
PostgreSQL @ Sun
San Francisco

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] PostgreSQL Configuration Tool for Dummies

2007-06-26 Thread Greg Smith

On Tue, 26 Jun 2007, Josh Berkus wrote:


I find it extremely inconsistent that you want to select middle-of-the-road
defaults for some values and ask users detailed questions for other values.
Which are we trying to do, here?


I'd like to see people have a really simple set of questions to get them 
past the completely undersized initial configuration phase, then ship them 
toward resources to help educate about the parts that could be problematic 
for them based on what they do or don't know.  I don't see an 
inconsistancy that I'd expect people to have a reasonable guess for 
max_connections, while also telling them that setting sort_mem is 
important, a middle value has been assigned, but a really correct setting 
isn't something they can expect the simple config tool to figure out for 
them; here's a pointer to the appropriate documentation to learn more.


The fact that we can't expect an accurate answer on database size 
(except from the minority of users who already have a full production 
DB) will be a chronic stumbling block for any conf tool we build.


I'm still of the opinion that recommendations for settings like 
max_fsm_pages and maintenance_work_mem should come out of a different type 
of tool that connects to the database.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] PostgreSQL Configuration Tool for Dummies - feedback adjustable control

2007-06-25 Thread Sabin Coanda

[EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]
 On Fri, 22 Jun 2007, Sabin Coanda wrote:

 Instead of (or in addition to) configure dozens of settings, what do you 
 say about a feedback adjustable control based on the existing system 
 statistics and parsing logs (e.g 
 http://pgfouine.projects.postgresql.org/index.html ) ?

 something like this would be useful for advanced tuneing, but the biggest 
 problem is that it's so difficult to fingoure out a starting point. bad 
 choices at the starting point can cause several orders of magnatude 
 difference in the database performsnce. In addition we know that the 
 current defaults are bad for just about everyone (we just can't decide 
 what better defaults would be)


You are right. But an automatic tool beeing able to take decisions by 
different inputs, would be able to set a startup configuration too, based on 
the hw/sw environment, and interactive user requirements.

 this horrible starting point gives people a bad first impression that a 
 simple tool like what's being discussed can go a long way towards solving.


Well, I think to an automatic tool, not an utopian application good for 
everything. For instance the existing automatic daemon have some abilities, 
bat not all of the VACUUM command. I'm realistic that good things may be 
done in steps, not once.

I would be super happy if an available automatic configuration tool would be 
able to set for the beginning just the shared_buffers or  max_fsm_pages 
based on the available memory. Adjustments can be done later.

Regards,
Sabin 



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] PostgreSQL Configuration Tool for Dummies

2007-06-23 Thread Greg Smith

On Fri, 22 Jun 2007, Adam Tauno Williams wrote:


Just as an aside; how come the installation/setup Tutorial section -
http://www.postgresql.org/docs/8.2/interactive/tutorial-start.html -
doesn't mention setting some rough reasonable defaults in
postgresql.conf or even a reference to the parameter documentation
section.


I think that anyone who has been working with the software long to know 
what should go into such a section has kind of forgotten about this part 
of the documentation by the time they get there.  It is an oversight and 
yours is an excellent suggestion.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PERFORM] PostgreSQL Configuration Tool for Dummies - feedback adjustable control

2007-06-23 Thread Greg Smith

On Fri, 22 Jun 2007, Sabin Coanda wrote:

Instead of (or in addition to) configure dozens of settings, what do you 
say about a feedback adjustable control based on the existing system 
statistics and parsing logs


Take a look at the archive of this list for the end of April/Early May. 
There's a thread there named Feature Request --- was: PostgreSQL 
Performance Tuning that addressed this subject in length I think you'll 
find interesting reading.


I personally feel there's much more long-term potential for a tool that 
inspects the database, but the needs of something looking for getting good 
starting configuration file (before there necessarily is even a populated 
database) is different enough that it may justify building two different 
tools.


I would suggest you or anything else building the starter configuration 
tool not stray from the path of getting the most important variables set 
to reasonable values.  Trying to satisfy every possible user is the path 
that leads to a design so complicated that it's unlikely you'll ever get a 
finished build done at all.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] PostgreSQL Configuration Tool for Dummies

2007-06-22 Thread Greg Smith

On Thu, 21 Jun 2007, Scott Marlowe wrote:

Generally I agree, however, how about a first switch, for beginner / 
intermediate / advanced.


You're describing a perfectly reasonable approach for a second generation 
tool in this area.  I think it would be very helpful for the user 
community to get a first generation one that works fairly well before 
getting distracted at all by things like this.  The people capable of 
filling out the intermediate/advanced settings can probably just do a bit 
of reading and figure out most of what they should be doing themselves.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [PERFORM] PostgreSQL Configuration Tool for Dummies

2007-06-22 Thread Adam Tauno Williams
On Fri, 2007-06-22 at 02:32 -0400, Greg Smith wrote:
 On Thu, 21 Jun 2007, Scott Marlowe wrote:
 
  Generally I agree, however, how about a first switch, for beginner / 
  intermediate / advanced.
 
 You're describing a perfectly reasonable approach for a second generation 
 tool in this area.  I think it would be very helpful for the user 
 community to get a first generation one that works fairly well before 
 getting distracted at all by things like this.  The people capable of 
 filling out the intermediate/advanced settings can probably just do a bit 
 of reading and figure out most of what they should be doing themselves.

Just as an aside; how come the installation/setup Tutorial section -
http://www.postgresql.org/docs/8.2/interactive/tutorial-start.html -
doesn't mention setting some rough reasonable defaults in
postgresql.conf or even a reference to the parameter documentation
section.  It seems like such a reference should exist between -
http://www.postgresql.org/docs/8.2/interactive/tutorial-arch.html - and
- http://www.postgresql.org/docs/8.2/interactive/tutorial-accessdb.html

At least something along those lines should be said at
http://www.postgresql.org/docs/8.2/interactive/install-post.html

Personally, as DBA for more than a decade,  I've got 0 sympathy for
people who setup a database but can't be bothered to read the
documentation.  But in the case of PostgreSQL the documentation could do
a better job of driving users to even the existence [and importance of]
postgresql.conf and routine maintenance techniques.
http://www.postgresql.org/docs/8.2/interactive/runtime-config.html
http://www.postgresql.org/docs/8.2/interactive/maintenance.html

Seems to me that even a remake of something like -
http://www.iiug.org/~waiug/old/forum2000/SQLTunning/sld001.htm - focused
on PostgreSQL would be novel and very interesting.

Just my two cents.  

PostgreSQL is awesome, BTW.




---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] PostgreSQL Configuration Tool for Dummies - feedback adjustable control

2007-06-22 Thread Sabin Coanda

  Campbell, Lance [EMAIL PROTECTED] wrote in message news:[EMAIL 
PROTECTED]
  Below is a link to the HTML JavaScript configuration page I am creating:

   

  http://www.webservices.uiuc.edu/postgresql/

   

  I had many suggestions.  Based on the feedback I received, I put together the 
initial list of questions.  This list of questions can be change.

   

Instead of  (or in addition to) configure dozens of settings, what do you say 
about a feedback adjustable control based on the existing system statistics and 
parsing logs (e.g http://pgfouine.projects.postgresql.org/index.html ) ?

Such an application improved with notifications would be useful for 
experimented users, too. A database is not static and it may evolve to 
different requirements. The initial configuration may be deprecated after one 
year.

Regards,
Sabin

Re: [PERFORM] PostgreSQL Configuration Tool for Dummies - feedback adjustable control

2007-06-22 Thread david

On Fri, 22 Jun 2007, Sabin Coanda wrote:

Instead of (or in addition to) configure dozens of settings, what do you 
say about a feedback adjustable control based on the existing system 
statistics and parsing logs (e.g 
http://pgfouine.projects.postgresql.org/index.html ) ?


something like this would be useful for advanced tuneing, but the biggest 
problem is that it's so difficult to fingoure out a starting point. bad 
choices at the starting point can cause several orders of magnatude 
difference in the database performsnce. In addition we know that the 
current defaults are bad for just about everyone (we just can't decide 
what better defaults would be)


this horrible starting point gives people a bad first impression that a 
simple tool like what's being discussed can go a long way towards solving.


David Lang

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PERFORM] PostgreSQL Configuration Tool for Dummies

2007-06-21 Thread Stefan Kaltenbrunner
Campbell, Lance wrote:
 Now I am at the difficult part, what parameters to calculate and how to
 calculate them.  Everything below has to do with PostgreSQL version 8.2:
 
  
 
 The parameters I would think we should calculate are:
 
 max_connections
 
 shared_buffers
 
 work_mem
 
 maintenance_work_mem
 
 effective_cache_size
 
 random_page_cost
 
  
 
 Any other variables?  I am open to suggestions.


we also should scale max_fsm_pages according to the database size and
workload answers - I also note that the configuration file it generates
seems to look like on for PostgreSQL 7.x or something - I think we
should just include the specific parameters to change.


Stefan

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] PostgreSQL Configuration Tool for Dummies

2007-06-21 Thread Ben Trewern

 Campbell, Lance [EMAIL PROTECTED] wrote in message 
 news:[EMAIL PROTECTED]
 Now I am at the difficult part, what parameters to calculate and how to
 calculate them.  Everything below has to do with PostgreSQL version 8.2:


 The parameters I would think we should calculate are:
 max_connections
 shared_buffers
 work_mem
 maintenance_work_mem
 effective_cache_size
 random_page_cost

 Any other variables?  I am open to suggestions.

I know this is mainly about tuning for performance but I do think you ought 
to give the option to change at least 'listen_address'.  Something like:

Accept connections on: - Local connections (Unix sockets/localhost)
- All TCP/IP interfaces
- Specific IP addresses: 
___ (comma-seperated list)

and maybe a pointer to the pg_hba.conf docs for further info.

Regards,

Ben 



---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] PostgreSQL Configuration Tool for Dummies

2007-06-21 Thread Scott Marlowe

Greg Smith wrote:

On Tue, 19 Jun 2007, Josh Berkus wrote:

I don't think the mostly reads / mostly writes question covers 
anything,
nor is it likely to produce accurate answers.  Instead, we need to 
ask the

users to characterize what type of application they are running:
T1) Please characterize the general type of workload you will be 
running on

this database.  Choose one of the following four...


We've hashed through this area before, but for Lance's benefit I'll 
reiterate my dissenting position on this subject.  If you're building 
a tool for dummies, my opinion is that you shouldn't ask any of this 
information.  I think there's an enormous benefit to providing 
something that takes basic sizing information and gives conservative 
guidelines based on that--as you say, safe, middle-of-the-road 
values--that are still way, way more useful than the default values.  
The risk in trying to make a complicated tool that satisfies all the 
users Josh is aiming his more sophisticated effort at is that you'll 
lose the newbies. 
Generally I agree, however, how about a first switch, for beginner / 
intermediate / advanced.


The choice you make determines how much detail we ask you about your 
setup.  Beginners get two or three simple questions, intermediate a 
handful, and advanced gets grilled on everything.  Then, just write the 
beginner and maybe intermediate to begin with and ghost out the advanced 
until it's ready.


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] PostgreSQL Configuration Tool for Dummies

2007-06-20 Thread Greg Smith

On Tue, 19 Jun 2007, Josh Berkus wrote:


I don't think the mostly reads / mostly writes question covers anything,
nor is it likely to produce accurate answers.  Instead, we need to ask the
users to characterize what type of application they are running:
T1) Please characterize the general type of workload you will be running on
this database.  Choose one of the following four...


We've hashed through this area before, but for Lance's benefit I'll 
reiterate my dissenting position on this subject.  If you're building a 
tool for dummies, my opinion is that you shouldn't ask any of this 
information.  I think there's an enormous benefit to providing something 
that takes basic sizing information and gives conservative guidelines 
based on that--as you say, safe, middle-of-the-road values--that are 
still way, way more useful than the default values.  The risk in trying to 
make a complicated tool that satisfies all the users Josh is aiming his 
more sophisticated effort at is that you'll lose the newbies.


Scan the archives of this mailing list for a bit.  If you look at what 
people discover they've being nailed by, it's rarely because they need to 
optimize something like random_page_cost.  It's usually because they have 
a brutally wrong value for one of the memory or vacuum parameters that are 
very easy to provide reasonable suggestions for without needing a lot of 
information about the server.


I wouldn't even bother asking how many CPUs somebody has for what Lance is 
building.  The kind of optimizations you'd do based on that are just too 
complicated to expect a tool to get them right and still be accessible to 
a novice.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] PostgreSQL Configuration Tool for Dummies

2007-06-20 Thread Kevin Hunter

At 4:35p -0400 on 19 Jun 2007, Lance Campbell wrote:

The parameters I would think we should calculate are:
max_connections
shared_buffers
work_mem
maintenance_work_mem
effective_cache_size
random_page_cost


From an educational/newb standpoint, I notice that the page  
currently spews out a configuration file completely in line with  
what's currently there, comments and all.  May I suggest highlighting  
what has been altered, perhaps above or below the textbox?  It would  
make it immediately obvious, and easier to add an explanation of the  
thought process involved.  Something like


What's changed from the default:

li
pstrongmax_connections = 5/strong/p
	pThis follows directly from you put above.  It is the maximum  
number of concurrent connections Postgres will allow./p

/li
li
pstrongshared_buffers = 1/strong/p
	pThis setting will take some time to get exactly right for your  
needs.  Postgres uses this for .../p

/li

Not something that necessarily needs to be spelled out in the .conf  
file, but would, IMVHO, help minimally educate.


Kevin

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] PostgreSQL Configuration Tool for Dummies

2007-06-19 Thread Richard Huxton

Campbell, Lance wrote:

Please think simple.  I stress the word simple.  The real challenge here
is getting the formulas correct.  Someone mentioned to not focus on the
values but just get something out there for everyone to help tweak.  I
agree!

What questions do you think should be asked in order to figure out what
values should go into the formulas for the configuration suggestions?  


My thoughts:

What version of PostgreSQL are you using?


OK, obviously not needed if embedded in the manuals.

 How many connections will be made to PostgreSQL?
OK (but changed order)


How much memory will be available to PostgreSQL?

Would structure it like:
- What is total memory of your machine?
- How much do you want to reserve for other apps (e.g. apache/java)?

Also:
- How many disks will PG be using?
- How much data do you think you'll store?
- Will your usage be: mostly reads|balance of read+write|mostly writes
- Are your searches: all very simple|few complex|lots of complex queries

Then, with the output provide a commentary stating reasons why for the 
chosen values. e.g.

  random_page_cost = 1.0
  Because you have [effective_cache_size = 1GB] and [total db size = 
0.5GB] the cost of fetching a page is the same no matter what order you 
fetch them in.


--
  Richard Huxton
  Archonet Ltd

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PERFORM] PostgreSQL Configuration Tool for Dummies

2007-06-19 Thread Francisco Reyes

Campbell, Lance writes:

3) I suggested JavaScript because most people that get started with 
PostgreSQL will go to the web in order to find out about issues relating 


Why not c?
It could then go into contrib.
Anyways.. language is likely the least important issue..
As someone mentioned.. once the formulas are worked out it can be done in a 
few languages.. as people desire..



How much memory will be available to PostgreSQL?
How many connections will be made to PostgreSQL?


Will this be a dedicated Postgresql server?
Will there be mostly reads or will there also be significant amount of 
writes?


Are you on a RAID system or do you have several disks over which you would 
like to run postgresql on?


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] PostgreSQL Configuration Tool for Dummies

2007-06-19 Thread Campbell, Lance
Francisco and Richard,
Why ask about disk or raid?  How would that impact any settings in
postgresql.conf?

I did forget the obvious question:

What OS are you using?

Thanks,

Lance Campbell
Project Manager/Software Architect
Web Services at Public Affairs
University of Illinois
217.333.0382
http://webservices.uiuc.edu
 

-Original Message-
From: Francisco Reyes [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, June 19, 2007 11:58 AM
To: Campbell, Lance
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] PostgreSQL Configuration Tool for Dummies

Campbell, Lance writes:

 3) I suggested JavaScript because most people that get started with 
 PostgreSQL will go to the web in order to find out about issues
relating 

Why not c?
It could then go into contrib.
Anyways.. language is likely the least important issue..
As someone mentioned.. once the formulas are worked out it can be done
in a 
few languages.. as people desire..

 How much memory will be available to PostgreSQL?
 How many connections will be made to PostgreSQL?

Will this be a dedicated Postgresql server?
Will there be mostly reads or will there also be significant amount of 
writes?

Are you on a RAID system or do you have several disks over which you
would 
like to run postgresql on?

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] PostgreSQL Configuration Tool for Dummies

2007-06-19 Thread Richard Huxton

Campbell, Lance wrote:

Francisco and Richard,
Why ask about disk or raid?  How would that impact any settings in
postgresql.conf?


Well, random_page_cost will depend on how fast your disk system can 
locate a non-sequential page. If you have a 16-disk RAID-10 array that's 
noticably less time than a single 5400rpm IDE in a laptop.



I did forget the obvious question:

What OS are you using?


Tricky to keep simple, isn't it :-)

--
  Richard Huxton
  Archonet Ltd

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [PERFORM] PostgreSQL Configuration Tool for Dummies

2007-06-19 Thread PFC




What version of PostgreSQL are you using?


	I think newbies should be pushed a bit to use the latest versions, maybe  
with some advice on how to setup the apt sources (in debian/ubuntu) to get  
them.



How much memory will be available to PostgreSQL?

How many connections will be made to PostgreSQL?


	I also think Postgres newbies using PHP should be encouraged to use  
something like ligttpd/fastcgi instead of Apache. The fastcgi model  
permits use of very few database connections and working PHP processes  
since lighttpd handles all the slow transfers to the client  
asynchronously. You can do the same with two Apache instances, one serving  
static pages and acting as a proxy for the second Apache serving dynamic  
pages.
	With this setup, even low-end server setups (For our personal sites, a  
friend and I share a dedicated server with 256MB of RAM, which we rent for  
20€ a month). This thing will never run 200 Apache processes, but we have  
no problem with lighttpd/php-fcgi and postgres.


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] PostgreSQL Configuration Tool for Dummies

2007-06-19 Thread Francisco Reyes

Campbell, Lance writes:


Francisco and Richard,
Why ask about disk or raid?  How would that impact any settings in
postgresql.conf?


If the user has 2 disks and says that he will do a lot of updates he could 
put pg_xlog in the second disk.



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] PostgreSQL Configuration Tool for Dummies

2007-06-19 Thread Y Sidhu

On 6/19/07, Francisco Reyes [EMAIL PROTECTED] wrote:


Campbell, Lance writes:

 Francisco and Richard,
 Why ask about disk or raid?  How would that impact any settings in
 postgresql.conf?

If the user has 2 disks and says that he will do a lot of updates he could
put pg_xlog in the second disk.


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster



Let's not ask about disk or raid at this level of sanity tuning. It is
important for a newbie to take the right first step. When it comes to disks,
we start talking I/O, SATA, SCSI and the varying degrees of SATA and SCSI,
and controller cards. Then we throw in RAID and the different levels
therein. Add to that, we can talk about drivers controlling these drives and
which OS is faster, more stable, etc. As you can see, a newbie would get
drowned. So, please keep it simple. I know many people on this list are
Gurus. We know you are the best in this field, but we are not and are just
trying to improve what we have.

--
Yudhvir Singh Sidhu
408 375 3134 cell


Re: [PERFORM] PostgreSQL Configuration Tool for Dummies

2007-06-19 Thread Campbell, Lance
Yudhvir,

I completely agree.  I was just putting together a similar email.

 

Thanks,

 

Lance Campbell

Project Manager/Software Architect

Web Services at Public Affairs

University of Illinois

217.333.0382

http://webservices.uiuc.edu

 



From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Y Sidhu
Sent: Tuesday, June 19, 2007 12:49 PM
To: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] PostgreSQL Configuration Tool for Dummies

 

 

On 6/19/07, Francisco Reyes [EMAIL PROTECTED] wrote:

Campbell, Lance writes:

 Francisco and Richard,
 Why ask about disk or raid?  How would that impact any settings in
 postgresql.conf?

If the user has 2 disks and says that he will do a lot of updates he
could 
put pg_xlog in the second disk.


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Let's not ask about disk or raid at this level of sanity tuning. It is
important for a newbie to take the right first step. When it comes to
disks, we start talking I/O, SATA, SCSI and the varying degrees of SATA
and SCSI, and controller cards. Then we throw in RAID and the different
levels therein. Add to that, we can talk about drivers controlling these
drives and which OS is faster, more stable, etc. As you can see, a
newbie would get drowned. So, please keep it simple. I know many people
on this list are Gurus. We know you are the best in this field, but we
are not and are just trying to improve what we have.

-- 
Yudhvir Singh Sidhu
408 375 3134 cell 



Re: [PERFORM] PostgreSQL Configuration Tool for Dummies

2007-06-19 Thread D'Arcy J.M. Cain
On Tue, 19 Jun 2007 12:58:26 -0400
Francisco Reyes [EMAIL PROTECTED] wrote:
 Campbell, Lance writes:
  3) I suggested JavaScript because most people that get started with 
  PostgreSQL will go to the web in order to find out about issues relating 
 
 Why not c?

Why not whatever and install it on www.PostgreSQL.org?  Is there any
reason that this tool would need to be run on every installation.  Run
it on the site and it can always be up to date and can be written in
whatever language is easiest to maintain on the mother system.

I would also like to make a pitch for a JavaScript-free tool.  Just
collect all the pertinent information, work it out and display the
results in a second page.  Some people just don't like JavaScript and
turn it off even if we can run it in our browser.

-- 
D'Arcy J.M. Cain [EMAIL PROTECTED] |  Democracy is three wolves
http://www.druid.net/darcy/|  and a sheep voting on
+1 416 425 1212 (DoD#0082)(eNTP)   |  what's for dinner.

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] PostgreSQL Configuration Tool for Dummies

2007-06-19 Thread Campbell, Lance
D'Arcy,
I wanted to put it on the www.postgresql.org site.  That is what I said
in my original email.  I don't believe anyone from the actual project
has contacted me.

I am setting up a JavaScript version first.  If someone wants to do a
different one feel free.  I will have all of the calculations in the
JavaScript so it should be easy to do it in any language.

Thanks,

Lance Campbell
Project Manager/Software Architect
Web Services at Public Affairs
University of Illinois
217.333.0382
http://webservices.uiuc.edu
 

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of D'Arcy J.M.
Cain
Sent: Tuesday, June 19, 2007 12:32 PM
To: Francisco Reyes
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] PostgreSQL Configuration Tool for Dummies

On Tue, 19 Jun 2007 12:58:26 -0400
Francisco Reyes [EMAIL PROTECTED] wrote:
 Campbell, Lance writes:
  3) I suggested JavaScript because most people that get started with 
  PostgreSQL will go to the web in order to find out about issues
relating 
 
 Why not c?

Why not whatever and install it on www.PostgreSQL.org?  Is there any
reason that this tool would need to be run on every installation.  Run
it on the site and it can always be up to date and can be written in
whatever language is easiest to maintain on the mother system.

I would also like to make a pitch for a JavaScript-free tool.  Just
collect all the pertinent information, work it out and display the
results in a second page.  Some people just don't like JavaScript and
turn it off even if we can run it in our browser.

-- 
D'Arcy J.M. Cain [EMAIL PROTECTED] |  Democracy is three wolves
http://www.druid.net/darcy/|  and a sheep voting on
+1 416 425 1212 (DoD#0082)(eNTP)   |  what's for dinner.

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PERFORM] PostgreSQL Configuration Tool for Dummies

2007-06-19 Thread Y Sidhu

On 6/19/07, Campbell, Lance [EMAIL PROTECTED] wrote:


 Below is a link to the HTML JavaScript configuration page I am creating:



http://www.webservices.uiuc.edu/postgresql/



I had many suggestions.  Based on the feedback I received, I put together
the initial list of questions.  This list of questions can be change.



Memory

There are many different ways to ask about memory.  Rather than ask a
series of questions I went with a single question, #2.  If it is better to
ask about the memory in a series of questions then please give me the
questions you would ask and why you would ask each of them.  From my
understanding the primary memory issue as it relates to PostgreSQL is how
much memory is available to PostgreSQL.  Remember that this needs to be as
simple as possible.



My next step is to list the PostgreSQL parameters found in the
postgresql.conf file and how I will generate their values based on the
questions I have so far.  I will primarily focus on PostgreSQL 8.2.x.
Once I have a consensus from everyone then I will put functionality behind
the Generate Suggested Settings button.



Thanks for all of the feedback,





Lance Campbell

Project Manager/Software Architect

Web Services at Public Affairs

University of Illinois

217.333.0382

http://webservices.uiuc.edu




Lance,

Simply awesome!


--
Yudhvir Singh Sidhu
408 375 3134 cell


Re: [PERFORM] PostgreSQL Configuration Tool for Dummies

2007-06-19 Thread Francisco Reyes

Campbell, Lance writes:

For the 6) Are your searches:
How about having many simple


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] PostgreSQL Configuration Tool for Dummies

2007-06-19 Thread david

On Tue, 19 Jun 2007, Y Sidhu wrote:


On 6/19/07, Francisco Reyes [EMAIL PROTECTED] wrote:


 Campbell, Lance writes:

  Francisco and Richard,
  Why ask about disk or raid?  How would that impact any settings in
  postgresql.conf?

 If the user has 2 disks and says that he will do a lot of updates he could
 put pg_xlog in the second disk.


 ---(end of broadcast)---
 TIP 2: Don't 'kill -9' the postmaster



Let's not ask about disk or raid at this level of sanity tuning. It is
important for a newbie to take the right first step. When it comes to disks,
we start talking I/O, SATA, SCSI and the varying degrees of SATA and SCSI,
and controller cards. Then we throw in RAID and the different levels
therein. Add to that, we can talk about drivers controlling these drives and
which OS is faster, more stable, etc. As you can see, a newbie would get
drowned. So, please keep it simple. I know many people on this list are
Gurus. We know you are the best in this field, but we are not and are just
trying to improve what we have.


I strongly agree.

besides, the number and types of drives, raid configurations, etc is so 
variable that I strongly believe that the right answer is going to be 
something along the lines of 'run this tool and then enter the number(s) 
that the tool reports' and then let the tool measure the end result of all 
the variables rather then trying to calculate the results.


David Lang

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] PostgreSQL Configuration Tool for Dummies

2007-06-19 Thread david

On Tue, 19 Jun 2007, Campbell, Lance wrote:


Memory

There are many different ways to ask about memory.  Rather than ask a
series of questions I went with a single question, #2.  If it is better
to ask about the memory in a series of questions then please give me the
questions you would ask and why you would ask each of them.  From my
understanding the primary memory issue as it relates to PostgreSQL is
how much memory is available to PostgreSQL.  Remember that this needs
to be as simple as possible.


there are three catagories of memory useage

1. needed by other software
2. available for postgres
3. needed by the OS

it's not clear if what you are asking is #2 or a combination of #2 and #3

IMHO you should ask for #2 and #3, possibly along the lines of how much 
memory is in the machine that isn't already used by other applications


David Lang

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [PERFORM] PostgreSQL Configuration Tool for Dummies

2007-06-19 Thread Josh Berkus

 there are three catagories of memory useage

 1. needed by other software
 2. available for postgres
 3. needed by the OS

There's actually only two required memory questions:

M1) How much RAM do you have on this machine?
M2) Is this:
() Dedicated PostgreSQL Server?
() Server shared with a few other applications?
() Desktop?

I don't think the mostly reads / mostly writes question covers anything,  
nor is it likely to produce accurate answers.  Instead, we need to ask the 
users to characterize what type of application they are running:

T1) Please characterize the general type of workload you will be running on 
this database.  Choose one of the following four:
() WEB: any scripting-language application which mainly needs to support 
90% or more data reads, and many rapid-fire small queries over a large 
number of connections. Examples: forums, content management systems, 
directories. 
() OLTP: this application involves a large number of INSERTs, UPDATEs and 
DELETEs because most users are modifying data instead of just reading it.  
Examples: accounting, ERP, logging tools, messaging engines.
() Data Warehousing: also called decision support and BI, these 
database support a fairly small number of large, complicated reporting 
queries, very large tables, and large batch data loads.
() Mixed/Other: if your application doesn't fit any of the above, our 
script will try to pick safe, middle-of-the-road values.

Hmmm, drop question (6) too.

(2) should read: What is the maximum number of database connections which 
you'll need to support?  If you don't know, we'll pick a default.

Other questions we need:

How many/how fast processors do you have?  Pick the option which seems 
closest to what you have:
() A single laptop processor
() Single or dual older processors (1ghz)
() Dual or quad current consumer processors (2ghz+)
() Large, recent multi-core server system

What OS Are You Using, of course, needs to have Linux, Solaris, BSD, OSX 
and Windows.  At some point, this tool will also need to generate for the 
user any shmem settings that they need to make on the OS.

-- 
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [PERFORM] PostgreSQL Configuration Tool for Dummies

2007-06-19 Thread Ray Stell
On Tue, Jun 19, 2007 at 10:49:01AM -0700, Y Sidhu wrote:
 On 6/19/07, Francisco Reyes [EMAIL PROTECTED] wrote:
 
 Campbell, Lance writes:
 
  Francisco and Richard,
  Why ask about disk or raid?  How would that impact any settings in
  postgresql.conf?
 
 If the user has 2 disks and says that he will do a lot of updates he could
 put pg_xlog in the second disk.
 
 
 ---(end of broadcast)---
 TIP 2: Don't 'kill -9' the postmaster
 
 
 Let's not ask about disk or raid at this level of sanity tuning. It is
 important for a newbie to take the right first step. When it comes to disks,
 we start talking I/O, SATA, SCSI and the varying degrees of SATA and SCSI,
 and controller cards. Then we throw in RAID and the different levels
 therein. Add to that, we can talk about drivers controlling these drives and
 which OS is faster, more stable, etc. As you can see, a newbie would get
 drowned. So, please keep it simple. I know many people on this list are
 Gurus. We know you are the best in this field, but we are not and are just
 trying to improve what we have.



Ignoring the i/o subsystem in db configuration, there's an idea.

You could request some bonnie++ output (easy to aquire) as a baseline, 
do your magic analysis based on this, and skip it if it is not provided
with a warning.  Course the magic may be harder to come by. 

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] PostgreSQL Configuration Tool for Dummies

2007-06-19 Thread Tom Lane
PFC [EMAIL PROTECTED] writes:
 What version of PostgreSQL are you using?

   I think newbies should be pushed a bit to use the latest versions,

How about pushed *hard* ?  I'm constantly amazed at the number of people
who show up in the lists saying they installed 7.3.2 or whatever random
version they found in a dusty archive somewhere.  Please upgrade is at
least one order of magnitude more valuable configuration advice than
anything else we could tell them.

If the configurator is a live tool on the website, then it could be
aware of the latest release numbers and prod people with an appropriate
amount of urgency depending on how old they say their version is.  This
may be the one good reason not to provide it as a standalone program.

(No, we shouldn't make it try to phone home for latest release numbers
--- in the first place, that won't work if the machine is really
isolated from the net, and in the second place people will be suspicious
of the motives.)

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [PERFORM] PostgreSQL Configuration Tool for Dummies

2007-06-19 Thread Josh Berkus
Lance,

 The parameters I would think we should calculate are:

 max_connections

 shared_buffers

 work_mem

 maintenance_work_mem

 effective_cache_size

 random_page_cost

Actually, I'm going to argue against messing with random_page_cost.  It's a 
cannon being used when a slingshot is called for.  Instead (and this was 
the reason for the What kind of CPU? question) you want to reduce the 
cpu_* costs.  I generally find that if cpu_* are reduced as appropriate to 
modern faster cpus, and effective_cache_size is set appropriately, a 
random_page_cost of 3.5 seems to work for appropriate choice of index 
scans.

If you check out my spreadsheet version of this:
http://pgfoundry.org/docman/view.php/1000106/84/calcfactors.sxc
... you'll see that the approach I found most effective was to create 
profiles for each of the types of db applications, and then adjust the 
numbers based on those.  

Other things to adjust:
wal_buffers
checkpoint_segments
commit_delay
vacuum_delay
autovacuum

Anyway, do you have a pgfoundry ID?  I should add you to the project.


-- 
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco

---(end of broadcast)---
TIP 6: explain analyze is your friend