Re: [PERFORM] About the relation between fragmentation of file and VACUUM

2005-12-01 Thread Bill McGonigle

On Dec 1, 2005, at 00:50, Tatsumi Abe wrote:


However, in my current environment I can't stop PostgreSQL to clear
up the fragmentation.

Could anyone advise some solutions for this fragmentation problem
without stopping PostgreSQL?


This is somewhat of an aside, intended just as a helpful suggestion 
since I've been in this spot before: if you have this kind of uptime 
requirement, the first project to work on is getting the environment 
to the point where you can take at least one database server at a 
time out for maintenance.  You're going to be forced to do this 
sooner or later - whether by disk failure, software error (Pg or OS), 
user error (a restore from backup), or security issues (patches that 
must be applied).


So disk fragmentation is a great thing to worry about at some point, 
but IMHO you've got your neck under the guillotine and you're 
worrying about your cuticles.
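
That said, if you need a stopgap for the fragmentation itself, a 
per-table rewrite doesn't require shutting the server down - just an 
exclusive lock on that one table while it runs.  A minimal sketch, 
with a hypothetical table 'orders' and its index 'orders_pkey':

  -- CLUSTER rewrites the table into fresh files, which the
  -- filesystem can usually lay out more contiguously.  The table is
  -- locked against reads and writes for the duration, but the
  -- server (and every other table) stays available.
  CLUSTER orders_pkey ON orders;
  ANALYZE orders;  -- refresh planner statistics after the rewrite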


I've heard the arguments before, usually around budget, and if the 
company can't spend any money but needs blood-from-stone performance 
tweaks, somebody isn't doing the math right (I'm assuming this isn't 
running on a satellite).  Plus, your blood pressure will go down when 
things are more resilient.  I've tried the superhero thing before and 
it's just not worth it.


-Bill
-
Bill McGonigle, Owner   Work: 603.448.4440
BFC Computing, LLC  Home: 603.448.1668
[EMAIL PROTECTED]   Mobile: 603.252.2606
http://www.bfccomputing.com/    Pager: 603.442.1833
Jabber: [EMAIL PROTECTED]  Text: [EMAIL PROTECTED]
Blog: http://blog.bfccomputing.com/




Re: [PERFORM] Hardware/OS recommendations for large databases (

2005-11-21 Thread Bill McGonigle
Would it be worth first agreeing on a common set of criteria to 
measure?  I see many data points going back and forth but not much 
agreement on what's worth measuring and how to measure it.


I'm not necessarily trying to herd cats, but it sure would be swell 
to have the several knowledgeable minds here come up with something 
that could be uniformly tested on a range of machines, possibly even 
integrated into pgbench or something.  Disagreements on criteria or 
methodology should be methodically testable.
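
As a strawman for making runs comparable, every test box could at 
least record the same baseline of system-level settings alongside 
each run - a minimal sketch (the particular names here are just 
examples):

  -- pg_settings is a standard system view; capturing it with each
  -- benchmark run makes results comparable across machines.
  SELECT name, setting
    FROM pg_settings
   WHERE name IN ('shared_buffers', 'work_mem',
                  'effective_cache_size', 'random_page_cost');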


Then I have dreams of a new pg_autotune that would know about these 
kinds of system-level settings.


I haven't been on this list for long, and I've only been using 
Postgres for a handful of years, so forgive me if this has been 
hashed out before.


-Bill
-
Bill McGonigle, Owner   Work: 603.448.4440
BFC Computing, LLC  Home: 603.448.1668
[EMAIL PROTECTED]   Mobile: 603.252.2606
http://www.bfccomputing.com/    Pager: 603.442.1833
Jabber: [EMAIL PROTECTED]  Text: [EMAIL PROTECTED]
Blog: http://blog.bfccomputing.com/




[PERFORM] Too Many OR's?

2005-11-15 Thread Bill McGonigle
I have a query that's making the planner do the wrong thing (for my 
definition of wrong) and I'm looking for advice on what to tune to make 
it do what I want.


The query consists of SELECT'ing a few fields from a table for a 
large number of rows.  The table has about seventy thousand rows and 
the user is selecting some subset of them.  I first do a 
SELECT...WHERE to determine the unique identifiers I want (works 
fine) and then I do a SELECT...WHERE...IN giving the list of ids I 
need additional data on (which I see from EXPLAIN just gets 
translated into a very long list of OR's).
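
Roughly this shape, with hypothetical table and column names:

  -- Step 1: get the identifiers of interest (works fine):
  SELECT id FROM items WHERE category = 42;

  -- Step 2: fetch the extra fields for those ids; EXPLAIN shows the
  -- IN list expanded into one OR per id:
  SELECT id, name, price
    FROM items
   WHERE id IN (101, 102, 103 /* ... tens of thousands more ... */);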


Everything works perfectly until I get to 65301 rows.  At 65300 rows, 
it does an index scan and takes 2197.193 ms.  At 65301 rows it 
switches to a sequential scan and takes 778951.556 ms.  Settings 
known not to affect this: work_mem and effective_cache_size.  
Lowering random_page_cost from 4 to 1 helps (79543.214 ms), but I'm 
not really sure what '1' means, except that it's relative.  Of 
course, setting 'enable_seqscan false' helps immensely (2337.289 ms), 
but that's as inelegant a solution as I've found - if there were 
other databases on this install, that wouldn't be the right approach.
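
For concreteness, a sketch of that toggle - a plain SET is 
session-local, so at least it doesn't change the defaults that other 
connections see:

  SET enable_seqscan = off;  -- discourage sequential scans
  -- ... run the big IN query here ...
  RESET enable_seqscan;      -- back to the server default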


Now I can break this down into multiple SELECT's in code, capping 
each query at 65300 rows, and that's a usable workaround (sketched 
below), but academically I'd like to know how to convince the planner 
to do it my way.  It's making a bad guess about something, but I'm 
not sure what.  I didn't see any hard-coded limits grepping through 
the source (though the number is close to the 16-bit unsigned 
boundary - probably coincidental), so if anyone has ideas or pointers 
to how I might figure out what's going wrong, that would be helpful.
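
For reference, the workaround looks something like this, with the 
same hypothetical names as above:

  -- Each chunk stays below the 65301-row tipping point:
  SELECT id, name, price FROM items
   WHERE id IN (/* first 65300 ids */);
  SELECT id, name, price FROM items
   WHERE id IN (/* remaining ids */);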


Thanks,
-Bill

-
Bill McGonigle, Owner   Work: 603.448.4440
BFC Computing, LLC  Home: 603.448.1668
[EMAIL PROTECTED]   Mobile: 603.252.2606
http://www.bfccomputing.com/    Pager: 603.442.1833
Jabber: [EMAIL PROTECTED]  Text: [EMAIL PROTECTED]
Blog: http://blog.bfccomputing.com/

