Re: [PERFORM] Feature Request --- was: PostgreSQL Performance Tuning

2007-05-05 Thread Mark Kirkwood

Josh Berkus wrote:

Sebastian,


Before inventing a hyper tool, we might consider providing 3-5 example
scenarios for common hardware configurations. This consumes less time
and can be discussed and defined in a couple of days. This is of course not
the right option for a brand-new 20-spindle SATA 10,000 RPM RAID 10 system,
but such systems are probably not the target for default configurations.


That's been suggested a number of times, but some GUCs are really tied to the 
*exact* amount of RAM you have available.  So I've never seen how example 
configurations could help.




I'm not convinced about this objection - having samples gives a bit of a 
heads up on *what* knobs you should at least look at changing.


Also it might be helpful on the -general or -perf lists to be able to 
say "try config 3" (or whatever we call 'em) and see what changes...


I've certainly found the sample config files supplied with that database 
whose name begins with 'M' a useful *start* when I want something better 
than default...


Cheers

Mark



Re: [PERFORM] Feature Request --- was: PostgreSQL Performance Tuning

2007-05-05 Thread Sebastian Hennebrueder
Mark Kirkwood wrote:
 Josh Berkus wrote:
 Sebastian,

 Before inventing a hyper tool, we might consider providing 3-5 example
 scenarios for common hardware configurations. This consumes less time
 and can be discussed and defined in a couple of days. This is of course not
 the right option for a brand-new 20-spindle SATA 10,000 RPM RAID 10 system,
 but such systems are probably not the target for default configurations.

 That's been suggested a number of times, but some GUCs are really
 tied to the *exact* amount of RAM you have available.  So I've never
 seen how example configurations could help.


 I'm not convinced about this objection - having samples gives a bit of
 a heads up on *what* knobs you should at least look at changing.

 Also it might be helpful on the -general or -perf lists to be able to
 say try config 3 (or whatever we call 'em) and see what changes...

 I've certainly found the sample config files supplied with that
 database whose name begins with 'M' a useful *start* when I want
 something better than default...

 Cheers

 Mark

Some ideas about scenarios and settings. This is meant as a discussion
proposal; I am by no means a database guru!
The settings do not provide a perfect setup, just a more efficient one
than the default.

criteria:
free memory
CPU? what is the consequence?
separate spindles
total connections
Windows/Linux/Solaris?

adapted settings:
max_connections
shared_buffers
effective_cache_size
work_mem
maintenance_work_mem

checkpoint_segments ?
checkpoint_timeout ?
checkpoint_warning ?


Scenario a) 256 MB free memory, one disk or RAID where all disks are in
the RAID:
max_connections = 40
shared_buffers = 64 MB
effective_cache_size = 180 MB
work_mem = 1 MB
maintenance_work_mem = 4 MB

Scenario b) 1024 MB free memory, one disk or RAID where all disks are in
the RAID:
max_connections = 80
shared_buffers = 128 MB
effective_cache_size = 600 MB
work_mem = 1.5 MB
maintenance_work_mem = 16 MB

Scenario c) 2048 MB free memory, one disk or RAID where all disks are in
the RAID:
max_connections = 160
shared_buffers = 256 MB
effective_cache_size = 1200 MB
work_mem = 2 MB
maintenance_work_mem = 32 MB

Scenario d) 2048 MB free memory, RAID of multiple disks, second RAID or
disk:
max_connections = 160
shared_buffers = 256 MB
effective_cache_size = 1200 MB
work_mem = 2 MB
maintenance_work_mem = 32 MB
WAL on second spindle
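
As one way to try a scenario out, here is scenario b) written as a
postgresql.conf excerpt -- a sketch only, assuming an 8.2-style
configuration that accepts memory units (older releases express
shared_buffers and friends as raw buffer/page counts):

  # postgresql.conf excerpt for scenario b) -- a proposal, not a tuned setup
  max_connections = 80
  shared_buffers = 128MB          # changing this requires a server restart
  effective_cache_size = 600MB    # planner hint only; allocates no memory
  work_mem = 1536kB               # per sort/hash operation, per backend
  maintenance_work_mem = 16MB     # used by VACUUM and CREATE INDEX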








[PERFORM] How to Find Cause of Long Vacuum Times - NOOB Question

2007-05-05 Thread Yudhvir Singh Sidhu
I hope someone can help me with this vacuum problem. I can post more 
info if needed.


Version:  PostgreSQL 8.0.9 on FreeBSD 6.1
Situation:  huge amounts of adds and deletes daily. Running daily vacuums.
Problem:  vacuum times jump from 45 minutes, or 1:30 hours, to 6+
hours overnight, once every 1 to 3 months.
Solutions tried:  a db truncate brings vacuum times down; reindexing also
brings vacuum times down.
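
(For reference, the reindexing mentioned above is the REINDEX command; the
table name here is hypothetical:)

  -- rebuild all indexes on one table, discarding accumulated index bloat;
  -- note that this takes an exclusive lock on the table while it runs
  REINDEX TABLE orders;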


I know my indexes are getting fragmented and my tables are getting 
fragmented. I also know that some of my btree indexes are not being used 
in queries. I also know that using UNIQUE in a query makes PG ignore 
any index.


I am looking for the cause of this. Recently I have been looking at 
EXPLAIN and ANALYZE.
1.  Running EXPLAIN on a query tells me how my query SHOULD run and 
running ANALYZE tells me how it DOES run. Is that correct?
2.  If (1) is true, then a difference between the two means my query 
plan is messed up and running ANALYZE on a table-level will somehow 
rebuild the plan. Is that correct?
3.  If (2) is correct, then  running ANALYZE on a nightly basis before 
running vacuum will keep vacuum times down. Is that correct?


Yudhvir Singh



Re: [PERFORM] How to Find Cause of Long Vacuum Times - NOOB Question

2007-05-05 Thread Steinar H. Gunderson
On Sat, May 05, 2007 at 03:57:25PM -0700, Yudhvir Singh Sidhu wrote:
 Situation:  huge amounts of adds and deletes daily. Running daily vacuums

If you have huge amounts of adds and deletes, you might want to vacuum more
often; optionally, look into autovacuum.
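
A minimal sketch of enabling it on 8.1 or later, where autovacuum is built
into the server (on the poster's 8.0.x it is the separate
contrib/pg_autovacuum daemon instead); the values shown are only
illustrative:

  # postgresql.conf -- autovacuum needs row-level statistics collection
  stats_start_collector = on
  stats_row_level = on
  autovacuum = on
  autovacuum_naptime = 60    # seconds between checks of each database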

 Problem:  Vacuum times jump up from 45 minutes, or 1:30 minutes to 6+ 
 hours overnight, once every 1 to 3 months.

You might want to check your FSM settings. Take a look at the output of
VACUUM VERBOSE and see how the results stack up against your FSM settings.
Optionally, you could do a VACUUM FULL to clear the bloat, but this will lock
the tables and is not recommended on a regular basis.
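
The settings in question are max_fsm_pages and max_fsm_relations; both live
in postgresql.conf and need a server restart to change. A minimal sketch,
with purely illustrative values:

  -- from psql: the summary at the very end of the output reports how
  -- much free space the map is tracking, compared against these limits
  VACUUM VERBOSE;

  # postgresql.conf -- raise these if the VACUUM VERBOSE summary shows
  # the free space map running out of room
  max_fsm_pages = 200000      # total free disk pages tracked, server-wide
  max_fsm_relations = 2000    # distinct tables and indexes tracked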

 I know my indexes are getting fragmented and my tables are getting 
 fragmented. 

This sounds like a case of table bloat, ie. vacuuming too seldom and/or too
low FSM settings.

 I also know that some of my btree indexes are not being used in queries.

This is a separate problem, usually; if you need help with a specific query,
post query and the EXPLAIN ANALYZE output here. (Note that using an index is
not always a win; Postgres' planner knows about this and tries to figure out
when it is a win and when it is not.)

 I also know that using UNIQUE in a query makes PG ignore any index.

Do you mean DISTINCT? There are known problems with SELECT DISTINCT, but I'm
not sure how it could make Postgres start ignoring an index. Again, it's a
separate problem.

 I am looking for the cause of this. Recently I have been looking at 
 EXPLAIN and ANALYZE.

This is a good beginning. :-)

 1.  Running EXPLAIN on a query tells me how my query SHOULD run and 
 running ANALYZE tells me how it DOES run. Is that correct?

Nearly. EXPLAIN tells you which plan Postgres has chosen, with estimates of
the costs of each step. EXPLAIN ANALYZE (just plain ANALYZE is a different
command, which updates the planner's statistics) does the same, but also runs
the query and shows the time each step ended up taking. (Note that the
units of the estimates and the timings are different, so you can't compare
them directly.)
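
As a concrete illustration (the table and column names are hypothetical):

  -- shows the chosen plan with cost estimates only; the query is not run
  EXPLAIN SELECT * FROM orders WHERE customer_id = 42;

  -- runs the query, adding actual time and row counts for each plan step
  EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = 42;

  -- plain ANALYZE is the separate command: it only refreshes the
  -- statistics the planner uses
  ANALYZE orders;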

 2.  If (1) is true, then a difference between the two means my query 
 plan is messed up and running ANALYZE on a table-level will somehow 
 rebuild the plan. Is that correct?

Again, sort of right, but not entirely. ANALYZE updates the planner's
statistics. Having good statistics is very useful for the planner in
selecting the plan that actually ends up being the best.

 3.  If (2) is correct, then  running ANALYZE on a nightly basis before 
 running vacuum will keep vacuum times down. Is that correct?

No, ANALYZE will only update planner statistics, which has nothing to do with
vacuum times. On the other hand, it might help with some of your queries.

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] How to Find Cause of Long Vacuum Times - NOOB Question

2007-05-05 Thread Yudhvir Singh Sidhu

Steinar H. Gunderson wrote:
 On Sat, May 05, 2007 at 03:57:25PM -0700, Yudhvir Singh Sidhu wrote:
  Situation:  huge amounts of adds and deletes daily. Running daily vacuums.

 If you have huge amounts of adds and deletes, you might want to vacuum more
 often; optionally, look into autovacuum.

  Problem:  vacuum times jump from 45 minutes, or 1:30 hours, to 6+
  hours overnight, once every 1 to 3 months.

 You might want to check your FSM settings. Take a look at the output of
 VACUUM VERBOSE and see how the results stack up against your FSM settings.
 Optionally, you could do a VACUUM FULL to clear the bloat, but this will lock
 the tables and is not recommended on a regular basis.

  I know my indexes are getting fragmented and my tables are getting
  fragmented.

 This sounds like a case of table bloat, ie. vacuuming too seldom and/or too
 low FSM settings.

  I also know that some of my btree indexes are not being used in queries.

 This is a separate problem, usually; if you need help with a specific query,
 post query and the EXPLAIN ANALYZE output here. (Note that using an index is
 not always a win; Postgres' planner knows about this and tries to figure out
 when it is a win and when it is not.)

  I also know that using UNIQUE in a query makes PG ignore any index.

 Do you mean DISTINCT? There are known problems with SELECT DISTINCT, but I'm
 not sure how it could make Postgres start ignoring an index. Again, it's a
 separate problem.

  I am looking for the cause of this. Recently I have been looking at
  EXPLAIN and ANALYZE.

 This is a good beginning. :-)

  1.  Running EXPLAIN on a query tells me how my query SHOULD run and
  running ANALYZE tells me how it DOES run. Is that correct?

 Nearly. EXPLAIN tells you which plan Postgres has chosen, with estimates of
 the costs of each step. EXPLAIN ANALYZE (just plain ANALYZE is a different
 command, which updates the planner's statistics) does the same, but also runs
 the query and shows the time each step ended up taking. (Note that the
 units of the estimates and the timings are different, so you can't compare
 them directly.)

  2.  If (1) is true, then a difference between the two means my query
  plan is messed up and running ANALYZE on a table-level will somehow
  rebuild the plan. Is that correct?

 Again, sort of right, but not entirely. ANALYZE updates the planner's
 statistics. Having good statistics is very useful for the planner in
 selecting the plan that actually ends up being the best.

  3.  If (2) is correct, then running ANALYZE on a nightly basis before
  running vacuum will keep vacuum times down. Is that correct?

 No, ANALYZE will only update planner statistics, which has nothing to do with
 vacuum times. On the other hand, it might help with some of your queries.

 /* Steinar */
Gee wow. I am so glad I looked into this subject. I think I am on the 
right path to solving the long-running vacuum problem. Thanks a lot for 
the detailed insight, Steinar.


Here is what I think the story is:
a.  Large numbers of rows are added to and deleted from a table daily. 
With this much activity, the statistics get out of whack easily. That's 
where ANALYZE or VACUUM ANALYZE would help with query speed.
b.  If ANALYZE does not have a direct impact on vacuum times, what does? 
Meaning, what in this EXPLAIN/ANALYZE and indexing world would have a 
direct impact?


Again, thank you Steinar for validating my suspicion. It is great to be 
on the right path.


Yudhvir




Here is another command that I suspect does something different from 
ANALYZE by itself:  VACUUM ANALYZE.
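
(As a side note, VACUUM ANALYZE does indeed combine the two: it performs a
vacuum and a statistics refresh in one pass. A short sketch, with a
hypothetical table name:)

  -- reclaims dead row versions and refreshes planner statistics together
  VACUUM ANALYZE orders;

  -- equivalent in effect to running the two commands separately:
  VACUUM orders;
  ANALYZE orders;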




Re: [PERFORM] Feature Request --- was: PostgreSQL Performance Tuning

2007-05-05 Thread Jim Nasby

On May 4, 2007, at 12:11 PM, Josh Berkus wrote:

Sebastian,

 Before inventing a hyper tool, we might consider providing 3-5 example
 scenarios for common hardware configurations. This consumes less time
 and can be discussed and defined in a couple of days. This is of course
 not the right option for a brand-new 20-spindle SATA 10,000 RPM RAID 10
 system, but such systems are probably not the target for default
 configurations.

That's been suggested a number of times, but some GUCs are really tied to
the *exact* amount of RAM you have available.  So I've never seen how
example configurations could help.


Uh... what GUCs are that exacting about the amount of memory? For a 
decent, baseline configuration, that is.

--
Jim Nasby    [EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)


