Re: [PERFORM] understanding postgres issues/bottlenecks

2009-01-13 Thread Jeff


On Jan 11, 2009, at 9:43 PM, M. Edward (Ed) Borasky wrote:


Luke Lonergan wrote:
Not to mention the #1 cause of server faults in my experience: OS  
kernel bug causes a crash.  Battery backup doesn't help you much  
there.




Not that long ago (a month or so) we ran into a problem where hpacucli  
(Utility for configuring/inspecting/etc HP smartarray controllers)  
would tickle the cciss driver in such a way that it would  cause a  
kernel panic. KABLAMMO (No data loss! we!).   The box had run for  
a long time without crashes, but it seems that when we added more  
disks and started the array building the new logical drive some  
magical things happened.


Bugs happen.  The [bad word] of it is catching the culprit with its  
fingers in the cookie jar.


--
Jeff Trout j...@jefftrout.com
http://www.stuarthamm.net/
http://www.dellsmartexitin.com/




--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] dbt-2 tuning results with postgresql-8.3.5

2009-01-13 Thread Kevin Grittner
 Mark Wong mark...@gmail.com wrote: 
 
 It appears to peak around 220 database connections:
 
 http://pugs.postgresql.org/node/514
 
Interesting.  What did you use for connection pooling?
 
My tests have never stayed that flat as the connections in use
climbed.  I'm curious why we're seeing such different results.
 
-Kevin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Slow insert performace, 8.3 Wal related?

2009-01-13 Thread Alan Hodgson
On Monday 12 January 2009, Bill Preston billpres...@crownepointe.net 
wrote:
 As to the second example with the delete. There are no foreign keys.
 For the index. If the table has fields a,b,c and d.
 We have a btree index (a,b,c,d)
 and we are saying DELETE FROM table_messed_up WHERE a=x.


Is there anything special about this table? Does it have like a hundred 
indexes on it or something? Because deleting 8k rows from a normal table 
should never take more than a couple of seconds.

-- 
Alan

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] strange index behaviour with different statistics target

2009-01-13 Thread Jeff Frost
So, I had a query that uses a postgis geometry index and the planner was
underestimating the number of rows it would return.  Because of this,
the planner was choosing the geometry index over a compound index on the
other columns in the WHERE clause.  So, I thought, let me increase the
stats target for that geometry column.  I did, and I got a different
(and better) plan, but when I looked at the estimates for the simplified
query against the geometry column alone, I noticed that neither the cost
nor the estimated rows changed:

oitest=# explain ANALYZE  SELECT * FROM blips WHERE
((ST_Contains(blips.shape,
'010120E61049111956F1EB55C0A8E49CD843F34440')) );

 
QUERY
PLAN
  

---
 Index Scan using index_blips_on_shape_gist on blips  (cost=0.00..7.33
rows=1 width=13804) (actual time=0.113..745.394 rows=2827 loops=1)
   Index Cond: (shape 
'010120E61049111956F1EB55C0A8E49CD843F34440'::geometry)
   Filter: ((shape 
'010120E61049111956F1EB55C0A8E49CD843F34440'::geometry) AND
_st_contains(shape,
'010120E61049111956F1EB55C0A8E49CD843F34440'::geometry))
 Total runtime: 745.977 ms
(4 rows)

Time: 747.199 ms
oitest=# ALTER TABLE blips ALTER COLUMN shape SET statistics 1000;
ALTER TABLE
Time: 0.478 ms
oitest=# ANALYZE ;
ANALYZE
Time: 7727.097 ms
oitest=# explain ANALYZE  SELECT * FROM blips WHERE
((ST_Contains(blips.shape,
'010120E61049111956F1EB55C0A8E49CD843F34440')) );

 
QUERY
PLAN
  

---
 Index Scan using index_blips_on_shape_gist on blips  (cost=0.00..7.33
rows=1 width=13761) (actual time=0.117..755.781 rows=2827 loops=1)
   Index Cond: (shape 
'010120E61049111956F1EB55C0A8E49CD843F34440'::geometry)
   Filter: ((shape 
'010120E61049111956F1EB55C0A8E49CD843F34440'::geometry) AND
_st_contains(shape,
'010120E61049111956F1EB55C0A8E49CD843F34440'::geometry))
 Total runtime: 756.396 ms
(4 rows)

The width changed slightly, but the cost is 7.33 in both.

So, now I thought how could that have changed the plan?  Did the other
parts of the plan estimate change?  So I pulled the shape column out of
the where clause and left the others:

oitest=# ALTER TABLE blips ALTER COLUMN shape SET statistics 100;
ALTER TABLE
Time: 0.475 ms
oitest=# ANALYZE ;
ANALYZE
Time: 1225.325 ms
oitest=# explain ANALYZE  SELECT * FROM blips WHERE
(blips.content_id = 2410268 AND blips.content_type = E'Story');
 
QUERY
PLAN 
--
 Index Scan using index_blips_on_content_type_and_content_id on blips 
(cost=0.00..9.01 rows=2 width=13924) (actual time=0.026..0.027 rows=2
loops=1)
   Index Cond: (((content_type)::text = 'Story'::text) AND (content_id =
2410268))
 Total runtime: 0.046 ms
(3 rows)

Time: 1.111 ms
oitest=# ALTER TABLE blips ALTER COLUMN shape SET statistics 1000;
ALTER TABLE
Time: 0.506 ms
oitest=# ANALYZE ;
ANALYZE
Time: 7785.496 ms
oitest=# explain ANALYZE  SELECT * FROM blips WHERE
(blips.content_id = 2410268 AND blips.content_type = E'Story');
 QUERY
PLAN 
-
 Index Scan using index_blips_on_content_id on blips  (cost=0.00..7.29
rows=1 width=13761) (actual time=0.013..0.014 rows=2 loops=1)
   Index Cond: (content_id = 2410268)
   Filter: ((content_type)::text = 'Story'::text)
 Total runtime: 0.034 ms
(4 rows)

Time: 1.007 ms

So, my question is, should changing the stats target on the shape column
affect the stats for the content_id and content_type columns?  Also, why
does the index on content_id win out over the compound index on
(content_type, content_id)?

index_blips_on_content_id btree (content_id)
index_blips_on_content_type_and_content_id btree (content_type,
content_id)

-- 
Jeff Frost, Owner   j...@frostconsultingllc.com
Frost Consulting, LLC   http://www.frostconsultingllc.com/
Phone: 916-647-6411 FAX: 916-405-4032


-- 
Sent via pgsql-performance mailing list 

Re: [PERFORM] strange index behaviour with different statistics target

2009-01-13 Thread Tom Lane
Jeff Frost j...@frostconsultingllc.com writes:
 So, my question is, should changing the stats target on the shape column
 affect the stats for the content_id and content_type columns?

It would change the size of the sample for the table, which might
improve the accuracy of the stats.  IIRC you'd still get the same number
of histogram entries and most-common-values for the other columns, but
they might be more accurate.

 Also, why does the index on content_id win out over the compound index
 on (content_type, content_id)?

It's deciding (apparently correctly, from the explain results) that the
larger index isn't increasing the selectivity enough to be worth its
extra search cost.  I suppose content_type = 'Story' isn't very
selective in this table?

regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] strange index behaviour with different statistics target

2009-01-13 Thread Jeff Frost

On Tue, 13 Jan 2009, Tom Lane wrote:


Jeff Frost j...@frostconsultingllc.com writes:

So, my question is, should changing the stats target on the shape column
affect the stats for the content_id and content_type columns?


It would change the size of the sample for the table, which might
improve the accuracy of the stats.  IIRC you'd still get the same number
of histogram entries and most-common-values for the other columns, but
they might be more accurate.


Why would they be more accurate?  Do they somehow correlate with the other 
column's histogram and most-common-values when the stats target is increased 
on that column?


The planner is choosing a plan I like for the query, I'm just trying to 
understand why it's doing that since the planner thinks the gist index is 
going to give it a single row (vs the 2827 rows it actually gets) and the fact 
that the cost didn't change for perusing the gist index.  I guess I was 
expecting the estimated rowcount and cost for perusing the gist index to go up 
and when it didn't I was pleasantly surprised to find I got a plan I wanted 
anyway.





Also, why does the index on content_id win out over the compound index
on (content_type, content_id)?


It's deciding (apparently correctly, from the explain results) that the
larger index isn't increasing the selectivity enough to be worth its
extra search cost.  I suppose content_type = 'Story' isn't very
selective in this table?


Ah!  You're right, especially with this content_id!

--
Jeff Frost, Owner   j...@frostconsultingllc.com
Frost Consulting, LLC   http://www.frostconsultingllc.com/
Phone: 916-647-6411 FAX: 916-405-4032

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] strange index behaviour with different statistics target

2009-01-13 Thread Tom Lane
Jeff Frost j...@frostconsultingllc.com writes:
 On Tue, 13 Jan 2009, Tom Lane wrote:
 It would change the size of the sample for the table, which might
 improve the accuracy of the stats.  IIRC you'd still get the same number
 of histogram entries and most-common-values for the other columns, but
 they might be more accurate.

 Why would they be more accurate?

They'd be drawn from a larger sample of the table rows.  If we need a
random sample of N rows for the largest stats target among the columns,
we use all those rows for deriving the stats for the other columns too.

 The planner is choosing a plan I like for the query, I'm just trying to 
 understand why it's doing that since the planner thinks the gist index is 
 going to give it a single row (vs the 2827 rows it actually gets) and the 
 fact 
 that the cost didn't change for perusing the gist index.

You'd need to ask the postgis guys whether they have an estimator for
ST_Contains that actually does anything useful.  I haven't the foggiest
what the state of their stats support is.

[ looks again at the plan... ]  Actually it looks like the estimator
for  is what's at issue.  Estimators are attached to operators not
functions.

regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] dbt-2 tuning results with postgresql-8.3.5

2009-01-13 Thread Mark Wong
On Tue, Jan 13, 2009 at 7:40 AM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 Mark Wong mark...@gmail.com wrote:

 It appears to peak around 220 database connections:

 http://pugs.postgresql.org/node/514

 Interesting.  What did you use for connection pooling?

It's a fairly dumb but custom built C program for the test kit:

http://git.postgresql.org/?p=~markwkm/dbt2.git;a=summary

I think the bulk of the logic is in src/client.c, src/db_threadpool.c,
and src/transaction_queue.c.

 My tests have never stayed that flat as the connections in use
 climbed.  I'm curious why we're seeing such different results.

I'm sure the difference in workloads makes a difference.  Like you
implied earlier, I think we have to figure out what works best in for
our own workloads.

Regards,
Mark

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] index

2009-01-13 Thread Maksim Sosnovskiy
We are currently storing a large amount of networking data. The
database size is over 50 Gigabytes. It also grows by 10 Gigabytes
every month.

We are looking if there is a way to speedup lookups by IP Address.

The primary key is a set of values. And the key does include IP
Address as well. Will it be more efficient to also add index on IP
Address to speedup lookups by IP?

-- 
Regards,

Maksim

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] index

2009-01-13 Thread David Wilson
On Wed, Jan 14, 2009 at 12:53 AM, Maksim Sosnovskiy msosn...@gmail.com wrote:
Will it be more efficient to also add index on IP
 Address to speedup lookups by IP?

Most likely, especially if the IP address is not the first column in
your primary key index.

Have you done an explain analyze of your ip lookup query? If not, do
so; that can help. Then try creating the index and explain analyze the
query again to see what happens.

Knowing your existing schema/indices and such would let us do more
than guess- and not knowing the plan your current query is using makes
it difficult to know if there's a better one using a to-be-created
index.
-- 
- David T. Wilson
david.t.wil...@gmail.com

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance