Re: [PERFORM] Improving performance on system catalog

2007-03-28 Thread chris smith

I would like to speed up this query:





 Total runtime: 13.844 ms


Why bother?

It's running in less than 14 milliseconds.

--
Postgresql & php tutorials
http://www.designmagick.com/

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [PERFORM] Query using SeqScan instead of IndexScan

2006-03-31 Thread chris smith
On 4/1/06, Brendan Duddridge <[EMAIL PROTECTED]> wrote:
> Hi Jim,
>
> I'm not quite sure what you mean by the correlation of category_id?

It means how many distinct values does it have (at least that's my
understanding of it ;) ).

select category_id, count(*) from category_product group by category_id;

will show you how many category_id's there are and how many products
are in each category.

Having a lot of products in one category (or having a small amount of
categories) can slow things down because the db can't use the index
effectively.. which might be what you're seeing (hence why it's fast
for some categories, slow for others).


> On Mar 31, 2006, at 8:59 AM, Jim C. Nasby wrote:
>
> > What's the correlation of category_id? The current index scan cost
> > estimator places a heavy penalty on anything with a correlation much
> > below about 90%.
> >
> > On Wed, Mar 29, 2006 at 08:12:28PM -0700, Brendan Duddridge wrote:
> >> Hi,
> >>
> >> I have a query that is using a sequential scan instead of an index
> >> scan. I've turned off sequential scans and it is in fact faster with
> >> the index scan.
> >>
> >> Here's my before and after.
> >>
> >> Before:
> >>
> >> ssdev=# SET enable_seqscan TO DEFAULT;
> >> ssdev=# explain analyze select cp.product_id
> >>  from category_product cp, product_attribute_value pav
> >>  where cp.category_id = 1001082 and cp.product_id =
> >>  pav.product_id;
> >>
> >>
> >>QUERY PLAN
> >> -
> >> ---
> >> -
> >> ---
> >> --
> >> Hash Join  (cost=25.52..52140.59 rows=5139 width=4) (actual
> >> time=4.521..2580.520 rows=19695 loops=1)
> >>Hash Cond: ("outer".product_id = "inner".product_id)
> >>->  Seq Scan on product_attribute_value pav  (cost=0.00..40127.12
> >> rows=2387312 width=4) (actual time=0.039..1469.295 rows=2385846
> >> loops=1)
> >>->  Hash  (cost=23.10..23.10 rows=970 width=4) (actual
> >> time=2.267..2.267 rows=1140 loops=1)
> >>  ->  Index Scan using x_category_product__category_id_fk_idx
> >> on category_product cp  (cost=0.00..23.10 rows=970 width=4) (actual
> >> time=0.122..1.395 rows=1140 loops=1)
> >>Index Cond: (category_id = 1001082)
> >> Total runtime: 2584.221 ms
> >> (7 rows)
> >>
> >>
> >> After:
> >>
> >> ssdev=# SET enable_seqscan TO false;
> >> ssdev=# explain analyze select cp.product_id
> >>  from category_product cp, product_attribute_value pav
> >>  where cp.category_id = 1001082 and cp.product_id =
> >>  pav.product_id;
> >>
> >>
> >>   QUERY PLAN
> >> -
> >> ---
> >> -
> >> ---
> >> -
> >> Nested Loop  (cost=0.00..157425.22 rows=5139 width=4) (actual
> >> time=0.373..71.177 rows=19695 loops=1)
> >>->  Index Scan using x_category_product__category_id_fk_idx on
> >> category_product cp  (cost=0.00..23.10 rows=970 width=4) (actual
> >> time=0.129..1.438 rows=1140 loops=1)
> >>  Index Cond: (category_id = 1001082)
> >>->  Index Scan using product_attribute_value__product_id_fk_idx
> >> on product_attribute_value pav  (cost=0.00..161.51 rows=61 width=4)
> >> (actual time=0.016..0.053 rows=17 loops=1140)
> >>  Index Cond: ("outer".product_id = pav.product_id)
> >> Total runtime: 74.747 ms
> >> (6 rows)
> >>
> >> There's quite a big difference in speed there. 2584.221 ms vs. 74.747
> >> ms.
> >>
> >> Any ideas what I can do to improve this without turning sequential
> >> scanning off?
> >>
> >> Thanks,
> >>
> >> 
> >> Brendan Duddridge | CTO | 403-277-5591 x24 |  [EMAIL PROTECTED]
> >>
> >> ClickSpace Interactive Inc.
> >> Suite L100, 239 - 10th Ave. SE
> >> Calgary, AB  T2G 0V9
> >>
> >> http://www.clickspace.com
> >>
> >
> >
> >
> > --
> > Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
> > Pervasive Software  http://pervasive.comwork: 512-231-6117
> > vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461
> >
> > ---(end of
> > broadcast)---
> > TIP 1: if posting/reading through Usenet, please send an appropriate
> >subscribe-nomail command to [EMAIL PROTECTED] so that
> > your
> >message can get through to the mailing list cleanly
> >
>
>
>
>


--
Postgresql & php tutorials
http://www.designmagick.com/

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] Query using SeqScan instead of IndexScan

2006-04-01 Thread chris smith
On 4/2/06, Jim C. Nasby <[EMAIL PROTECTED]> wrote:
> On Sat, Apr 01, 2006 at 11:23:37AM +1000, chris smith wrote:
> > On 4/1/06, Brendan Duddridge <[EMAIL PROTECTED]> wrote:
> > > Hi Jim,
> > >
> > > I'm not quite sure what you mean by the correlation of category_id?
> >
> > It means how many distinct values does it have (at least that's my
> > understanding of it ;) ).
>
> Your understanding is wrong. :) What you're discussing is n_distinct.

Geez, I'm going well this week ;)

Thanks for the detailed info.

--
Postgresql & php tutorials
http://www.designmagick.com/

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Query using SeqScan instead of IndexScan

2006-04-01 Thread chris smith
On 4/2/06, chris smith <[EMAIL PROTECTED]> wrote:
> On 4/2/06, Jim C. Nasby <[EMAIL PROTECTED]> wrote:
> > On Sat, Apr 01, 2006 at 11:23:37AM +1000, chris smith wrote:
> > > On 4/1/06, Brendan Duddridge <[EMAIL PROTECTED]> wrote:
> > > > Hi Jim,
> > > >
> > > > I'm not quite sure what you mean by the correlation of category_id?
> > >
> > > It means how many distinct values does it have (at least that's my
> > > understanding of it ;) ).
> >
> > Your understanding is wrong. :) What you're discussing is n_distinct.


It'd be nice if the database developers agreed on what terms meant.

http://dev.mysql.com/doc/refman/5.1/en/myisam-index-statistics.html

The SHOW INDEX statement displays a cardinality value based on N/S,
where N is the number of rows in the table and S is the average value
group size. That ratio yields an approximate number of value groups in
the table.


A work colleague found that information a few weeks ago so that's
where my misunderstanding came from - if I'm reading that right they
use n_distinct as their "cardinality" basis.. then again I could be
reading that completely wrong too.

I believe postgres (because it's a lot more standards compliant).. but
sheesh - what a difference!

This week's task - stop reading mysql documentation.

--
Postgresql & php tutorials
http://www.designmagick.com/

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] Slow query - possible bug?

2006-04-13 Thread chris smith
On 4/13/06, Gavin Hamill <[EMAIL PROTECTED]> wrote:
> laterooms=# explain analyze select allocation0_."ID" as y1_,
> allocation0_."RoomID" as y2_, allocation0_."StatusID" as y4_,
> allocation0_."Price" as y3_, allocation0_."Number" as y5_,
> allocation0_."Date" as y6_ from "Allocation" allocation0_ where
> (allocation0_."Date" between '2006-06-09 00:00:00.00' and
> '2006-06-09 00:00:00.00')and(allocation0_."RoomID" in(4300591));
> QUERY PLAN
> --
>  Index Scan using ix_date on "Allocation" allocation0_  (cost=0.00..4.77
> rows=1 width=34) (actual time=1411.325..1689.860 rows=1 loops=1)
>Index Cond: (("Date" >= '2006-06-09'::date) AND ("Date" <=
> '2006-06-09'::date))
>Filter: ("RoomID" = 4300591)
>  Total runtime: 1689.917 ms
> (4 rows)

1.6secs isn't too bad on 4.3mill rows...

How many entries are there for that date range?

--
Postgresql & php tutorials
http://www.designmagick.com/

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PERFORM] Query on postgresql 7.4.2 not using index

2006-04-25 Thread chris smith
On 4/25/06, Arnau <[EMAIL PROTECTED]> wrote:
> Hi all,
>
>I have the following running on postgresql version 7.4.2:
>
> CREATE SEQUENCE agenda_user_group_id_seq
> MINVALUE 1
> MAXVALUE 9223372036854775807
> CYCLE
> INCREMENT 1
> START 1;
>
> CREATE TABLE AGENDA_USERS_GROUPS
> (
>   AGENDA_USER_GROUP_ID  INT8
> CONSTRAINT pk_agndusrgrp_usergroup PRIMARY KEY
> DEFAULT NEXTVAL('agenda_user_group_id_seq'),
>   USER_ID   NUMERIC(10)
> CONSTRAINT fk_agenda_uid  REFERENCES
> AGENDA_USERS (USER_ID)
> ON DELETE CASCADE
> NOT NULL,
>   GROUP_ID  NUMERIC(10)
> CONSTRAINT fk_agenda_gid  REFERENCES
> AGENDA_GROUPS (GROUP_ID)
> ON DELETE CASCADE
> NOT NULL,
>   CREATION_DATE DATE
> DEFAULT CURRENT_DATE,
> CONSTRAINT un_agndusrgrp_usergroup
> UNIQUE(USER_ID, GROUP_ID)
> );
>
> CREATE INDEX i_agnusrsgrs_userid ON AGENDA_USERS_GROUPS ( USER_ID );
> CREATE INDEX i_agnusrsgrs_groupid ON AGENDA_USERS_GROUPS ( GROUP_ID );
>
>
> When I execute:
>
> EXPLAIN ANALYZE SELECT agenda_user_group_id FROM agenda_users_groups
> WHERE group_id = 9;

Try

EXPLAIN ANALYZE SELECT agenda_user_group_id FROM agenda_users_groups
WHERE group_id::int8 = 9;

or

EXPLAIN ANALYZE SELECT agenda_user_group_id FROM agenda_users_groups
WHERE group_id = '9';

and let us know what happens.

--
Postgresql & php tutorials
http://www.designmagick.com/

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PERFORM] Query on postgresql 7.4.2 not using index

2006-04-25 Thread chris smith
> OK.  Stop and think about what you're telling postgresql to do here.
>
> You're telling it to cast the field group_id to int8, then compare it to
> 9.  How can it cast the group_id to int8 without fetching it?  That's
> right, you're ensuring a seq scan.  You need to put the int8 cast on the
> other side of that equality comparison, like:

Yeh that one was my fault :) I couldn't remember which way it went and
if 7.4.x had issues with int8 indexes..

--
Postgresql & php tutorials
http://www.designmagick.com/

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PERFORM] Worsening performance with 7.4 on flash-based system

2006-04-29 Thread chris smith

On 4/29/06, Greg Stumph <[EMAIL PROTECTED]> wrote:

Well, since I got no response at all to this message, I can only assume that
I've asked the question in an insufficient way, or else that no one has
anything to offer on our problem.

This was my first post to the list, so if there's a better way I should be
asking this, or different data I should provide, hopefully someone will let
me know...

Thanks,
Greg

"Greg Stumph" <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]
> We are experiencing gradually worsening performance in PostgreSQL 7.4.7,
> on a system with the following specs:
> Linux OS (Fedora Core 1, 2.4 kernal)
> Flash file system (2 Gig, about 80% full)
> 256 Meg RAM
> 566 MHz Celeron CPU
>
> We use Orbit 2.9.8 to access PostGres. The database contains 62 tables.
>
> When the system is running with a fresh copy of the database, performance
> is fine. At its worst, we are seeing fairly simple SELECT queries taking
> up to 1 second to execute. When these queries are run in a loop, the loop
> can take up to 30 seconds to execute, instead of the 2 seconds or so that
> we would expect.


If you're inserting/updating/deleting a table or tables heavily, then
you'll need to vacuum it a lot more often than a reasonably static
table. Are you running contrib/autovacuum at all? PG 8.0 and above
have autovacuum built in but 7.4.x needs to run the contrib version.

PS - the latest 7.4 version is .12 - see
http://www.postgresql.org/docs/7.4/interactive/release.html for what
has changed (won't be much in performance terms but may fix data-loss
bugs).

--
Postgresql & php tutorials
http://www.designmagick.com/

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PERFORM] select with "like" from another table

2006-05-29 Thread chris smith

On 5/29/06, Anton Maksimenkov <[EMAIL PROTECTED]> wrote:

Hi.

I have 2 tables - one with calls numbers and another with calls codes.
The structure almost like this:

...

How long does this query take?

SELECT code FROM a_voip_codes c, a_voip v where v.called_station_id
like c.code ||
'%' order by code desc limit 1

I wonder if you'll benefit from an index on a_voip(called_station_id)
to speed up this join.

--
Postgresql & php tutorials
http://www.designmagick.com/

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


[PERFORM] simple query join

2004-03-07 Thread Chris Smith
Title: Message



Hi 
all,
 
I've got what 
should be a relatively simple join between two tables that is taking forever and 
I can't work out why.
 
Version 
7.3.4RH.
 
It can't be 
upgraded because the system is kept in sync with RedHat Enterprise (using 
up2date). Not my system otherwise I'd do that :(
 
Database has been 
'vacuum analyze'd.
 
blah=> \d 
sq_asset; 
Table "public.sq_asset" 
Column 
|    
Type 
|    
Modifiers 
+-+-- type_code  
| character varying(100)  | not 
null version    | character 
varying(20)   | not null default 
'0.0.0' name   
| character varying(255)  | not null default 
'' short_name | character 
varying(255)  | not null default 
'' status | 
integer 
| not null default 1 languages  | 
character varying(50)   | not null default 
'' charset    | character 
varying(50)   | not null default 
'' force_secure   | 
character(1)    
| not null default 
'0' created    | timestamp 
without time zone | not 
null updated    | timestamp 
without time zone | not null created_userid | character 
varying(255)  | not null default 
'0' updated_userid | character 
varying(255)  | not null default 
'0' assetid    | 
integer 
| not null default 0Indexes: sq_asset_pkey primary key btree 
(assetid)

blah=> select 
count(*) from sq_asset; count --- 16467(1 
row)
 
 
blah=> \d 
sq_asset_permission; 
Table "public.sq_asset_permission"   Column   
|  
Type  
|  Modifiers   
++-- permission 
| 
integer    
| not null default 0 access | 
character(1)   | not 
null default '0' assetid    | character varying(255) | 
not null default '0' userid | character 
varying(255) | not null default '0'Indexes: sq_asset_permission_pkey primary 
key btree (assetid, userid, permission)    
"sq_asset_permission_access" btree ("access")    
"sq_asset_permission_assetid" btree (assetid)    
"sq_asset_permission_permission" btree (permission)    
"sq_asset_permission_userid" btree (userid)
blah=> select 
count(*) from sq_asset_permission; count 
--- 73715(1 row)
 
EXPLAIN ANALYZE 
SELECT p.*FROM sq_asset a, sq_asset_permission pWHERE a.assetid = 
p.assetidAND p.permission = '1'AND p.access = '1'AND p.userid = 
'0';   
QUERY 
PLAN Nested 
Loop  (cost=0.00..4743553.10 rows=2582 width=27) (actual 
time=237.91..759310.60 rows=11393 loops=1)   Join Filter: 
(("inner".assetid)::text = ("outer".assetid)::text)   ->  
Seq Scan on sq_asset_permission p  (cost=0.00..1852.01 rows=2288 width=23) 
(actual time=0.06..196.90 rows=12873 
loops=1) Filter: 
((permission = 1) AND ("access" = '1'::bpchar) AND (userid = '0'::character 
varying))   ->  Seq Scan on sq_asset a  
(cost=0.00..1825.67 rows=16467 width=4) (actual time=1.40..29.09 rows=16467 
loops=12873) Total runtime: 759331.85 msec(6 
rows)
 
It's a straight 
join so I can't see why it would be this slow.. The tables are pretty small 
too.
 
Thanks for any 
suggestions :)
 
Chris.
 


Re: [PERFORM] simple query join

2004-03-08 Thread Chris Smith
Title: Message



Eek. 
Casting both to varchar makes it super quick so I'll 
fix up the tables.
 
Added to the list of things to check for next 
time...
 
On a 
side note - I tried it with 7.4.1 on another box and it handled it 
ok.
 
Thanks again :)
 
Chris.
 

  
  -Original Message-From: Steven Butler 
  [mailto:[EMAIL PROTECTED] Sent: Monday, March 08, 2004 6:12 
  PMTo: Chris Smith; 
  [EMAIL PROTECTED]Subject: Re: [PERFORM] simple query 
  join
  Looks to me like it's because your assetid is 
  varchar in one table and an integer in the other table.  AFAIK, PG is 
  unable to use an index join when the join types are different.  The query 
  plan shows it is doing full table scans of both tables.
   
  Change both to varchar or both to integer and see 
  what happens.
   
  Also make sure to vacuum analyze the tables 
  regularly to keep the query planner statistics up-to-date.
   
  Cheers,
  Steve Butler
  
 assetid    
| 
integer 
| not null default 0Indexes: sq_asset_pkey primary key btree 
(assetid)

 assetid    | character varying(255) | not null 
default '0'EXPLAIN ANALYZE SELECT p.*FROM sq_asset a, 
sq_asset_permission pWHERE a.assetid = p.assetidAND p.permission = 
'1'AND p.access = '1'AND p.userid = 
'0';   
QUERY 
PLAN Nested 
Loop  (cost=0.00..4743553.10 rows=2582 width=27) (actual 
time=237.91..759310.60 rows=11393 loops=1)   Join Filter: 
(("inner".assetid)::text = ("outer".assetid)::text)   
->  Seq Scan on sq_asset_permission p  (cost=0.00..1852.01 
rows=2288 width=23) (actual time=0.06..196.90 rows=12873 
loops=1) Filter: 
((permission = 1) AND ("access" = '1'::bpchar) AND (userid = '0'::character 
varying))   ->  Seq Scan on sq_asset a  
(cost=0.00..1825.67 rows=16467 width=4) (actual time=1.40..29.09 rows=16467 
loops=12873) Total runtime: 759331.85 msec(6 
rows)


Re: [PERFORM] [ADMIN] syslog slowing the database?

2004-03-10 Thread Chris Smith
It might depend on how you're rotating it.

Try the copy/truncate method instead of moving the log file. If you move
the log file to another filename you usually have to restart the app
doing the logging before it starts logging again.

Chris.

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Christopher
Kings-Lynne
Sent: Thursday, March 11, 2004 12:35 PM
To: Tom Lane
Cc: Greg Spiegelberg; PgSQL Performance ML; Postgres Admin List
Subject: Re: [PERFORM] [ADMIN] syslog slowing the database?


> You could also consider not using syslog at all: let the postmaster 
> output to its stderr, and pipe that into a log-rotation program. I 
> believe some people use Apache's log rotator for this with good 
> results.

Not an option I'm afraid.  PostgreSQL just jams and stops logging after 
the first rotation...

I've read in the docs that syslog logging is the only "production" 
solution...

Chris


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])



---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match