Re: [ADMIN] [PERFORM] autovacuum on a -mostly- r/o table

2006-10-15 Thread Tobias Brox
[Matthew T. O'Connor - Wed at 02:33:10PM -0400]
> In addition autovacuum respects the work of manual or cron based
> vacuums, so if you issue a vacuum right after a daily batch insert /
> update, autovacuum won't repeat the work of that manual vacuum.

I have been experimenting a bit with autovacuum now.  To have the least
possible impact, I started with a cost_delay/cost_limit ratio that was
too high.  The effect was that autovacuum never finished the transactions
it started, which in turn prevented the nightly vacuum from doing its
job well enough.
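For reference, a minimal sketch of the knobs in play here (8.1-era names;
the values are illustrative, not recommendations):

  -- session-level cost-based delay for a manual VACUUM
  SET vacuum_cost_delay = 200;   -- ms slept each time the cost limit is reached
  SET vacuum_cost_limit = 200;   -- accumulated cost units that trigger a sleep
  VACUUM ANALYZE;

  -- the autovacuum equivalents are set in postgresql.conf:
  --   autovacuum_vacuum_cost_delay = 200
  --   autovacuum_vacuum_cost_limit = 200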


Re: [ADMIN] [PERFORM] autovacuum on a -mostly- r/o table

2006-10-15 Thread Matthew T. O'Connor

Tobias Brox wrote:

> [Matthew T. O'Connor - Wed at 02:33:10PM -0400]
>> In addition autovacuum respects the work of manual or cron based
>> vacuums, so if you issue a vacuum right after a daily batch insert /
>> update, autovacuum won't repeat the work of that manual vacuum.
>
> I have been experimenting a bit with autovacuum now.  To have the least
> possible impact, I started with a cost_delay/cost_limit ratio that was
> too high.  The effect was that autovacuum never finished the transactions
> it started, which in turn prevented the nightly vacuum from doing its
> job well enough.


Yeah, I think if the delay settings are too high they can cause problems;
that's part of the reason we have yet to turn them on by default, since we
don't have enough data to suggest good values.  Can you tell us what
settings you finally settled on?


BTW, hopefully for 8.3 we are going to add the concept of maintenance
windows to autovacuum.  During these periods you can lower the thresholds,
and perhaps even change the delay settings, to make autovacuum more
aggressive.  This should just about eliminate the need for nightly
cron-based vacuum runs.
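Until that lands, a cron-driven nightly vacuum remains the usual workaround;
a minimal sketch (the schedule and database name are placeholders):

  -- typical crontab entry:  0 3 * * *  vacuumdb --all --analyze --quiet
  -- which amounts to running the following in each database:
  VACUUM ANALYZE;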





Re: [ADMIN] [PERFORM] autovacuum on a -mostly- r/o table

2006-10-15 Thread Tobias Brox
[Matthew T. O'Connor - Sun at 10:42:34AM -0400]
> Yeah, I think if the delay settings are too high they can cause problems;
> that's part of the reason we have yet to turn them on by default, since we
> don't have enough data to suggest good values.  Can you tell us what
> settings you finally settled on?

I'm still not settled on values, and the project manager is critical of
autovacuum (it adds complexity, shows no obvious benefits, and we can see
from the CPU graphs that it causes iowait, which is bad).  We're going to
run autovacuum biweekly now to see what effect it has on the server load.

I've been using cost/delay settings of 200/200 for a week now, and I'm going
to continue with 100/150 for a while.

Are there any known disadvantages to lowering both values to the extreme -
say, 20/20 instead of 200/200?  If I've understood the system right, that
would effectively mean "sleep as often as possible": the process naps 20 ms
after every 20 cost units, i.e. roughly 1 ms per cost unit spent - the same
ratio as 200/200, just in shorter and more frequent pauses.

Are there any logs that can help me, and possibly ready-made scripts, for
checking when autovacuum is running and how long it keeps its transactions
open?  I'll probably write up something myself if not.
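If nothing ready-made turns up, something along these lines could be a
starting point (a sketch against the 8.1-era pg_stat_activity columns; it
assumes stats_command_string is enabled, and whether autovacuum itself shows
up there is an open question):

  -- list backends currently running a vacuum, and for how long
  SELECT procpid,
         usename,
         now() - query_start AS running_for,
         current_query
  FROM pg_stat_activity
  WHERE current_query ILIKE '%vacuum%';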




Re: [PERFORM] Performance Optimization for Dummies 2 - the SQL

2006-10-15 Thread Carlo Stonebanks
Hi Merlin,

Well, I'm back.  First of all, thanks for your dogged determination to help
me out - it is much appreciated.  I owe you a beer or twelve.

The import has been running for a week, and the import program got faster as
I tuned things.  I capture the dynamic SQL statements generated by the app,
along with an accompanying EXPLAIN, and write them out to an XML file.  I
also turned off seq scans in the config and ran a trial import: with seq
scans off, I knew that any seq scan appearing in my log meant no index was
available to satisfy the query.  I adjusted accordingly, and this worked
really well.
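The trick amounts to something like this (a sketch; the query is just an
example against the tables discussed below):

  SET enable_seqscan = off;  -- heavily penalizes seq scans, so the planner
                             -- only picks one when no index can satisfy the query
  EXPLAIN
  SELECT f.facility_id
  FROM mdx_core.facility AS f
  WHERE f.default_postal_code = '14224';
  -- any "Seq Scan" node still in the plan flags a missing index
  RESET enable_seqscan;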

When the import runs against an empty or small db, it's blisteringly fast
(considering that it's a heuristically based process).  This proved that it
wasn't the app or the SQL connection that was slow.  Once again, though, as
the data db grows, it slows down; now it's crawling again.  All of the
queries appear to be fine, taking advantage of the indexes.  There is ONE
query, though, that seems to be the troublemaker - the same one I had
brought up before.  I believe one sub-query is causing the problem, taking
what appears to be 500 to 1000+ ms to run every time (see below).

Curiously, it's using index scans, and it really looks like a simple query 
to me. I am completely baffled. The two tables in question have about 800K 
rows each - not exactly an incredible number. The EXPLAIN is simple, but the 
performance is dreadful. All the other queries run much faster than this - 
does ANYTHING about this query strike you as odd?

Carlo

/*
Find all facilities that do not have full address information
but do have default location information indicating the
facility's US zip code.
NULL values cast as columns are placeholders that allow
this sub-query to be UNIONed with another sub-query
that contains full address data.
*/
select
    f.facility_id,
    null as facility_address_id,
    null as address_id,
    f.facility_type_code,
    f.name,
    null as address,
    f.default_city as city,
    f.default_state_code as state_code,
    f.default_postal_code as postal_code,
    f.default_country_code as country_code,
    null as parsed_unit
from
    mdx_core.facility as f
    left outer join mdx_core.facility_address as fa
        on fa.facility_id = f.facility_id
where
    fa.facility_address_id is null
    and f.default_country_code = 'US'
    and (f.default_postal_code = '14224-1945'
         or f.default_postal_code = '14224')

Nested Loop Left Join  (cost=22966.70..23594.84 rows=93 width=71) (actual time=662.075..662.075 rows=0 loops=1)
  Filter: ("inner".facility_address_id IS NULL)
  ->  Bitmap Heap Scan on facility f  (cost=22966.70..23231.79 rows=93 width=71) (actual time=661.907..661.929 rows=7 loops=1)
        Recheck Cond: (((default_country_code = 'US'::bpchar) AND ((default_postal_code)::text = '14224-1945'::text)) OR ((default_country_code = 'US'::bpchar) AND ((default_postal_code)::text = '14224'::text)))
        ->  BitmapOr  (cost=22966.70..22966.70 rows=93 width=0) (actual time=661.891..661.891 rows=0 loops=1)
              ->  Bitmap Index Scan on facility_country_state_postal_code_idx  (cost=0.00..11483.35 rows=47 width=0) (actual time=374.284..374.284 rows=7 loops=1)
                    Index Cond: ((default_country_code = 'US'::bpchar) AND ((default_postal_code)::text = '14224-1945'::text))
              ->  Bitmap Index Scan on facility_country_state_postal_code_idx  (cost=0.00..11483.35 rows=47 width=0) (actual time=287.599..287.599 rows=0 loops=1)
                    Index Cond: ((default_country_code = 'US'::bpchar) AND ((default_postal_code)::text = '14224'::text))
  ->  Index Scan using facility_address_facility_address_address_type_idx on facility_address fa  (cost=0.00..3.89 rows=1 width=8) (actual time=0.014..0.016 rows=1 loops=7)
        Index Cond: (fa.facility_id = "outer".facility_id)
Total runtime: 662.203 ms





Re: [PERFORM] Performance Optimization for Dummies 2 - the SQL

2006-10-15 Thread Tom Lane
Carlo Stonebanks [EMAIL PROTECTED] writes:
> Curiously, it's using index scans, and it really looks like a simple query
> to me. I am completely baffled. The two tables in question have about 800K
> rows each - not exactly an incredible number. The EXPLAIN is simple, but the
> performance is dreadful. All the other queries run much faster than this -
> does ANYTHING about this query strike you as odd?

Lots of dead rows perhaps?  The EXPLAIN estimates look a bit out of line
--- 11483 cost units to fetch 47 index entries is an order or two of
magnitude higher than it ought to be.  The real time also seems to be
concentrated in that index scan.  What are the physical sizes of the
table and index?  (VACUUM VERBOSE output for the facility table might
tell something.)
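For the physical sizes, a query along these lines would answer the question
(a sketch; relpages/reltuples are estimates recorded at the last VACUUM or
ANALYZE, and 8 KB is the default block size):

  SELECT relname, relpages, reltuples,
         relpages * 8 AS approx_kb   -- assumes the default 8 KB block size
  FROM pg_class
  WHERE relname IN ('facility', 'facility_country_state_postal_code_idx');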

regards, tom lane



Re: [PERFORM] Performance Optimization for Dummies 2 - the SQL

2006-10-15 Thread Carlo Stonebanks
Hey Tom, thanks for jumping in.  Nothing on TV on a Sunday afternoon? ;-)
Appreciate the input.

Here is vacuum verbose output for both the tables in question.

Carlo


INFO:  vacuuming "mdx_core.facility"
INFO:  index "facility_pkey" now contains 832399 row versions in 3179 pages
DETAIL:  0 index pages have been deleted, 0 are currently reusable.
CPU 0.09s/0.04u sec elapsed 0.21 sec.
INFO:  index "facility_country_state_city_idx" now contains 832444 row versions in 6630 pages
DETAIL:  0 index pages have been deleted, 0 are currently reusable.
CPU 0.15s/0.07u sec elapsed 43.81 sec.
INFO:  index "facility_country_state_postal_code_idx" now contains 832499 row versions in 6658 pages
DETAIL:  0 index pages have been deleted, 0 are currently reusable.
CPU 0.23s/0.07u sec elapsed 0.37 sec.
INFO:  "facility": found 0 removable, 832398 nonremovable row versions in 15029 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.67s/0.32u sec elapsed 44.71 sec.
INFO:  vacuuming "pg_toast.pg_toast_58570311"
INFO:  index "pg_toast_58570311_index" now contains 0 row versions in 1 pages
DETAIL:  0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  "pg_toast_58570311": found 0 removable, 0 nonremovable row versions in 0 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.

Query returned successfully with no result in 44875 ms.

INFO:  vacuuming "mdx_core.facility_address"
INFO:  index "facility_address_pkey" now contains 772770 row versions in 2951 pages
DETAIL:  0 index pages have been deleted, 0 are currently reusable.
CPU 0.10s/0.04u sec elapsed 9.73 sec.
INFO:  index "facility_address_address_idx" now contains 772771 row versions in 2750 pages
DETAIL:  0 index pages have been deleted, 0 are currently reusable.
CPU 0.04s/0.04u sec elapsed 0.34 sec.
INFO:  index "facility_address_facility_address_address_type_idx" now contains 772773 row versions in 3154 pages
DETAIL:  0 index pages have been deleted, 0 are currently reusable.
CPU 0.01s/0.04u sec elapsed 0.06 sec.
INFO:  "facility_address": found 0 removable, 772747 nonremovable row versions in 7969 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.39s/0.18u sec elapsed 10.70 sec.

Query returned successfully with no result in 10765 ms.









Re: [PERFORM] Hints proposal

2006-10-15 Thread Craig A. James

Josh Berkus wrote:
> I actually think the way to attack this issue is to discuss the kinds of
> errors the planner makes, and what tweaks we could do to correct them.
> Here are the ones I'm aware of:
>
> -- Incorrect selectivity of WHERE clause
> -- Incorrect selectivity of JOIN
> -- Wrong estimate of rows returned from SRF
> -- Incorrect cost estimate for index use
>
> Can you think of any others?


The one that started this discussion: lack of cost information for functions.
I think this feature is a good idea independent of the whole HINTS discussion.

At a minimum, a rough categorization is needed, such as "Lightning fast /
Fast / Medium / Slow / Ludicrously slow", with some sort of millisecond or
CPU-cycle figure associated with each category.  Or perhaps something like,
"This is (much faster|faster|same as|slower|much slower) than reading a block
from the disk."

If I understand Tom and others correctly, the planner is already capable of
taking advantage of this information; it just doesn't have it yet.  It could
be part of the CREATE FUNCTION command.

  CREATE OR REPLACE FUNCTION foobar(text, text, text) RETURNS text
 AS '/usr/local/pgsql/lib/foobar.so', 'foobar'
 COST LUDICROUSLY_SLOW
 LANGUAGE 'C' STRICT;

Better yet ('tho I have no idea how hard this would be to implement...) would 
be an optional second function with the same parameter signature as the main 
function, but it would return a cost estimate:

  CREATE OR REPLACE FUNCTION foobar(text, text, text) RETURNS text
 AS '/usr/local/pgsql/lib/foobar.so', 'foobar'
 COST foobar_cost
 LANGUAGE 'C' STRICT;

The planner could call it with the same parameters it was about to use, and get 
an accurate estimate for the specific operation that is about to be done.  In 
my particular case (running an NP-complete problem), there are cases where I 
can determine ahead of time that the function will be fast, but in most cases 
it is *really* slow.

Craig



Re: [PERFORM] Hints proposal

2006-10-15 Thread Craig A. James

So let's cut to the bone: If someone thinks a proposal is a bad idea, and 
they're volunteering their time on an open-source project, why would they 
implement the proposal?

In all the heat and smoke, I believe there are two basic conclusions we all 
agree on.

1. Optimizer:
  a) A perfect optimizer would be a wonderful thing
  b) Optimization is a hard problem
  c) Any problem that can be solved by improving the optimizer *should*
     be solved by improving the optimizer.

2. Hints
  a) On an aesthetic/theoretical level, hints suck.  They're ugly and rude
  b) On a practical level, introducing hints will cause short- and long-term
     problems
  c) Hints would help DBAs solve urgent problems for which there is no other
     solution

The disagreements revolve around the degree to which 1 conflicts with 2.

1. Developers feel very strongly about 2(a) and 2(b).
2. DBAs in the trenches feel very strongly about 2(c).

So my question is: Is there any argument that can be made to persuade those of 
you who are volunteering your time on the optimizer to even consider a HINTS 
proposal?  Has all this discussion changed your perspective on 2(c), and why it 
really matters to some of us?  Are we just wasting our time, or is this a 
fruitful discussion?

Thanks,
Craig
