Re: [PERFORM] Improving performance on system catalog
I would like to speed up this query: Total runtime: 13.844 ms Why bother? It's running in less than 14 milliseconds. -- Postgresql & php tutorials http://www.designmagick.com/ ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [PERFORM] Query using SeqScan instead of IndexScan
On 4/1/06, Brendan Duddridge <[EMAIL PROTECTED]> wrote: > Hi Jim, > > I'm not quite sure what you mean by the correlation of category_id? It means how many distinct values does it have (at least that's my understanding of it ;) ). select category_id, count(*) from category_product group by category_id; will show you how many category_id's there are and how many products are in each category. Having a lot of products in one category (or having a small amount of categories) can slow things down because the db can't use the index effectively.. which might be what you're seeing (hence why it's fast for some categories, slow for others). > On Mar 31, 2006, at 8:59 AM, Jim C. Nasby wrote: > > > What's the correlation of category_id? The current index scan cost > > estimator places a heavy penalty on anything with a correlation much > > below about 90%. > > > > On Wed, Mar 29, 2006 at 08:12:28PM -0700, Brendan Duddridge wrote: > >> Hi, > >> > >> I have a query that is using a sequential scan instead of an index > >> scan. I've turned off sequential scans and it is in fact faster with > >> the index scan. > >> > >> Here's my before and after. > >> > >> Before: > >> > >> ssdev=# SET enable_seqscan TO DEFAULT; > >> ssdev=# explain analyze select cp.product_id > >> from category_product cp, product_attribute_value pav > >> where cp.category_id = 1001082 and cp.product_id = > >> pav.product_id; > >> > >> > >>QUERY PLAN > >> - > >> --- > >> - > >> --- > >> -- > >> Hash Join (cost=25.52..52140.59 rows=5139 width=4) (actual > >> time=4.521..2580.520 rows=19695 loops=1) > >>Hash Cond: ("outer".product_id = "inner".product_id) > >>-> Seq Scan on product_attribute_value pav (cost=0.00..40127.12 > >> rows=2387312 width=4) (actual time=0.039..1469.295 rows=2385846 > >> loops=1) > >>-> Hash (cost=23.10..23.10 rows=970 width=4) (actual > >> time=2.267..2.267 rows=1140 loops=1) > >> -> Index Scan using x_category_product__category_id_fk_idx > >> on category_product cp (cost=0.00..23.10 rows=970 width=4) (actual > >> time=0.122..1.395 rows=1140 loops=1) > >>Index Cond: (category_id = 1001082) > >> Total runtime: 2584.221 ms > >> (7 rows) > >> > >> > >> After: > >> > >> ssdev=# SET enable_seqscan TO false; > >> ssdev=# explain analyze select cp.product_id > >> from category_product cp, product_attribute_value pav > >> where cp.category_id = 1001082 and cp.product_id = > >> pav.product_id; > >> > >> > >> QUERY PLAN > >> - > >> --- > >> - > >> --- > >> - > >> Nested Loop (cost=0.00..157425.22 rows=5139 width=4) (actual > >> time=0.373..71.177 rows=19695 loops=1) > >>-> Index Scan using x_category_product__category_id_fk_idx on > >> category_product cp (cost=0.00..23.10 rows=970 width=4) (actual > >> time=0.129..1.438 rows=1140 loops=1) > >> Index Cond: (category_id = 1001082) > >>-> Index Scan using product_attribute_value__product_id_fk_idx > >> on product_attribute_value pav (cost=0.00..161.51 rows=61 width=4) > >> (actual time=0.016..0.053 rows=17 loops=1140) > >> Index Cond: ("outer".product_id = pav.product_id) > >> Total runtime: 74.747 ms > >> (6 rows) > >> > >> There's quite a big difference in speed there. 2584.221 ms vs. 74.747 > >> ms. > >> > >> Any ideas what I can do to improve this without turning sequential > >> scanning off? > >> > >> Thanks, > >> > >> > >> Brendan Duddridge | CTO | 403-277-5591 x24 | [EMAIL PROTECTED] > >> > >> ClickSpace Interactive Inc. > >> Suite L100, 239 - 10th Ave. SE > >> Calgary, AB T2G 0V9 > >> > >> http://www.clickspace.com > >> > > > > > > > > -- > > Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] > > Pervasive Software http://pervasive.comwork: 512-231-6117 > > vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 > > > > ---(end of > > broadcast)--- > > TIP 1: if posting/reading through Usenet, please send an appropriate > >subscribe-nomail command to [EMAIL PROTECTED] so that > > your > >message can get through to the mailing list cleanly > > > > > > -- Postgresql & php tutorials http://www.designmagick.com/ ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Query using SeqScan instead of IndexScan
On 4/2/06, Jim C. Nasby <[EMAIL PROTECTED]> wrote: > On Sat, Apr 01, 2006 at 11:23:37AM +1000, chris smith wrote: > > On 4/1/06, Brendan Duddridge <[EMAIL PROTECTED]> wrote: > > > Hi Jim, > > > > > > I'm not quite sure what you mean by the correlation of category_id? > > > > It means how many distinct values does it have (at least that's my > > understanding of it ;) ). > > Your understanding is wrong. :) What you're discussing is n_distinct. Geez, I'm going well this week ;) Thanks for the detailed info. -- Postgresql & php tutorials http://www.designmagick.com/ ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Query using SeqScan instead of IndexScan
On 4/2/06, chris smith <[EMAIL PROTECTED]> wrote: > On 4/2/06, Jim C. Nasby <[EMAIL PROTECTED]> wrote: > > On Sat, Apr 01, 2006 at 11:23:37AM +1000, chris smith wrote: > > > On 4/1/06, Brendan Duddridge <[EMAIL PROTECTED]> wrote: > > > > Hi Jim, > > > > > > > > I'm not quite sure what you mean by the correlation of category_id? > > > > > > It means how many distinct values does it have (at least that's my > > > understanding of it ;) ). > > > > Your understanding is wrong. :) What you're discussing is n_distinct. It'd be nice if the database developers agreed on what terms meant. http://dev.mysql.com/doc/refman/5.1/en/myisam-index-statistics.html The SHOW INDEX statement displays a cardinality value based on N/S, where N is the number of rows in the table and S is the average value group size. That ratio yields an approximate number of value groups in the table. A work colleague found that information a few weeks ago so that's where my misunderstanding came from - if I'm reading that right they use n_distinct as their "cardinality" basis.. then again I could be reading that completely wrong too. I believe postgres (because it's a lot more standards compliant).. but sheesh - what a difference! This week's task - stop reading mysql documentation. -- Postgresql & php tutorials http://www.designmagick.com/ ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Slow query - possible bug?
On 4/13/06, Gavin Hamill <[EMAIL PROTECTED]> wrote: > laterooms=# explain analyze select allocation0_."ID" as y1_, > allocation0_."RoomID" as y2_, allocation0_."StatusID" as y4_, > allocation0_."Price" as y3_, allocation0_."Number" as y5_, > allocation0_."Date" as y6_ from "Allocation" allocation0_ where > (allocation0_."Date" between '2006-06-09 00:00:00.00' and > '2006-06-09 00:00:00.00')and(allocation0_."RoomID" in(4300591)); > QUERY PLAN > -- > Index Scan using ix_date on "Allocation" allocation0_ (cost=0.00..4.77 > rows=1 width=34) (actual time=1411.325..1689.860 rows=1 loops=1) >Index Cond: (("Date" >= '2006-06-09'::date) AND ("Date" <= > '2006-06-09'::date)) >Filter: ("RoomID" = 4300591) > Total runtime: 1689.917 ms > (4 rows) 1.6secs isn't too bad on 4.3mill rows... How many entries are there for that date range? -- Postgresql & php tutorials http://www.designmagick.com/ ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Query on postgresql 7.4.2 not using index
On 4/25/06, Arnau <[EMAIL PROTECTED]> wrote: > Hi all, > >I have the following running on postgresql version 7.4.2: > > CREATE SEQUENCE agenda_user_group_id_seq > MINVALUE 1 > MAXVALUE 9223372036854775807 > CYCLE > INCREMENT 1 > START 1; > > CREATE TABLE AGENDA_USERS_GROUPS > ( > AGENDA_USER_GROUP_ID INT8 > CONSTRAINT pk_agndusrgrp_usergroup PRIMARY KEY > DEFAULT NEXTVAL('agenda_user_group_id_seq'), > USER_ID NUMERIC(10) > CONSTRAINT fk_agenda_uid REFERENCES > AGENDA_USERS (USER_ID) > ON DELETE CASCADE > NOT NULL, > GROUP_ID NUMERIC(10) > CONSTRAINT fk_agenda_gid REFERENCES > AGENDA_GROUPS (GROUP_ID) > ON DELETE CASCADE > NOT NULL, > CREATION_DATE DATE > DEFAULT CURRENT_DATE, > CONSTRAINT un_agndusrgrp_usergroup > UNIQUE(USER_ID, GROUP_ID) > ); > > CREATE INDEX i_agnusrsgrs_userid ON AGENDA_USERS_GROUPS ( USER_ID ); > CREATE INDEX i_agnusrsgrs_groupid ON AGENDA_USERS_GROUPS ( GROUP_ID ); > > > When I execute: > > EXPLAIN ANALYZE SELECT agenda_user_group_id FROM agenda_users_groups > WHERE group_id = 9; Try EXPLAIN ANALYZE SELECT agenda_user_group_id FROM agenda_users_groups WHERE group_id::int8 = 9; or EXPLAIN ANALYZE SELECT agenda_user_group_id FROM agenda_users_groups WHERE group_id = '9'; and let us know what happens. -- Postgresql & php tutorials http://www.designmagick.com/ ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Query on postgresql 7.4.2 not using index
> OK. Stop and think about what you're telling postgresql to do here. > > You're telling it to cast the field group_id to int8, then compare it to > 9. How can it cast the group_id to int8 without fetching it? That's > right, you're ensuring a seq scan. You need to put the int8 cast on the > other side of that equality comparison, like: Yeh that one was my fault :) I couldn't remember which way it went and if 7.4.x had issues with int8 indexes.. -- Postgresql & php tutorials http://www.designmagick.com/ ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PERFORM] Worsening performance with 7.4 on flash-based system
On 4/29/06, Greg Stumph <[EMAIL PROTECTED]> wrote: Well, since I got no response at all to this message, I can only assume that I've asked the question in an insufficient way, or else that no one has anything to offer on our problem. This was my first post to the list, so if there's a better way I should be asking this, or different data I should provide, hopefully someone will let me know... Thanks, Greg "Greg Stumph" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] > We are experiencing gradually worsening performance in PostgreSQL 7.4.7, > on a system with the following specs: > Linux OS (Fedora Core 1, 2.4 kernal) > Flash file system (2 Gig, about 80% full) > 256 Meg RAM > 566 MHz Celeron CPU > > We use Orbit 2.9.8 to access PostGres. The database contains 62 tables. > > When the system is running with a fresh copy of the database, performance > is fine. At its worst, we are seeing fairly simple SELECT queries taking > up to 1 second to execute. When these queries are run in a loop, the loop > can take up to 30 seconds to execute, instead of the 2 seconds or so that > we would expect. If you're inserting/updating/deleting a table or tables heavily, then you'll need to vacuum it a lot more often than a reasonably static table. Are you running contrib/autovacuum at all? PG 8.0 and above have autovacuum built in but 7.4.x needs to run the contrib version. PS - the latest 7.4 version is .12 - see http://www.postgresql.org/docs/7.4/interactive/release.html for what has changed (won't be much in performance terms but may fix data-loss bugs). -- Postgresql & php tutorials http://www.designmagick.com/ ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PERFORM] select with "like" from another table
On 5/29/06, Anton Maksimenkov <[EMAIL PROTECTED]> wrote: Hi. I have 2 tables - one with calls numbers and another with calls codes. The structure almost like this: ... How long does this query take? SELECT code FROM a_voip_codes c, a_voip v where v.called_station_id like c.code || '%' order by code desc limit 1 I wonder if you'll benefit from an index on a_voip(called_station_id) to speed up this join. -- Postgresql & php tutorials http://www.designmagick.com/ ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[PERFORM] simple query join
Title: Message Hi all, I've got what should be a relatively simple join between two tables that is taking forever and I can't work out why. Version 7.3.4RH. It can't be upgraded because the system is kept in sync with RedHat Enterprise (using up2date). Not my system otherwise I'd do that :( Database has been 'vacuum analyze'd. blah=> \d sq_asset; Table "public.sq_asset" Column | Type | Modifiers +-+-- type_code | character varying(100) | not null version | character varying(20) | not null default '0.0.0' name | character varying(255) | not null default '' short_name | character varying(255) | not null default '' status | integer | not null default 1 languages | character varying(50) | not null default '' charset | character varying(50) | not null default '' force_secure | character(1) | not null default '0' created | timestamp without time zone | not null updated | timestamp without time zone | not null created_userid | character varying(255) | not null default '0' updated_userid | character varying(255) | not null default '0' assetid | integer | not null default 0Indexes: sq_asset_pkey primary key btree (assetid) blah=> select count(*) from sq_asset; count --- 16467(1 row) blah=> \d sq_asset_permission; Table "public.sq_asset_permission" Column | Type | Modifiers ++-- permission | integer | not null default 0 access | character(1) | not null default '0' assetid | character varying(255) | not null default '0' userid | character varying(255) | not null default '0'Indexes: sq_asset_permission_pkey primary key btree (assetid, userid, permission) "sq_asset_permission_access" btree ("access") "sq_asset_permission_assetid" btree (assetid) "sq_asset_permission_permission" btree (permission) "sq_asset_permission_userid" btree (userid) blah=> select count(*) from sq_asset_permission; count --- 73715(1 row) EXPLAIN ANALYZE SELECT p.*FROM sq_asset a, sq_asset_permission pWHERE a.assetid = p.assetidAND p.permission = '1'AND p.access = '1'AND p.userid = '0'; QUERY PLAN Nested Loop (cost=0.00..4743553.10 rows=2582 width=27) (actual time=237.91..759310.60 rows=11393 loops=1) Join Filter: (("inner".assetid)::text = ("outer".assetid)::text) -> Seq Scan on sq_asset_permission p (cost=0.00..1852.01 rows=2288 width=23) (actual time=0.06..196.90 rows=12873 loops=1) Filter: ((permission = 1) AND ("access" = '1'::bpchar) AND (userid = '0'::character varying)) -> Seq Scan on sq_asset a (cost=0.00..1825.67 rows=16467 width=4) (actual time=1.40..29.09 rows=16467 loops=12873) Total runtime: 759331.85 msec(6 rows) It's a straight join so I can't see why it would be this slow.. The tables are pretty small too. Thanks for any suggestions :) Chris.
Re: [PERFORM] simple query join
Title: Message Eek. Casting both to varchar makes it super quick so I'll fix up the tables. Added to the list of things to check for next time... On a side note - I tried it with 7.4.1 on another box and it handled it ok. Thanks again :) Chris. -Original Message-From: Steven Butler [mailto:[EMAIL PROTECTED] Sent: Monday, March 08, 2004 6:12 PMTo: Chris Smith; [EMAIL PROTECTED]Subject: Re: [PERFORM] simple query join Looks to me like it's because your assetid is varchar in one table and an integer in the other table. AFAIK, PG is unable to use an index join when the join types are different. The query plan shows it is doing full table scans of both tables. Change both to varchar or both to integer and see what happens. Also make sure to vacuum analyze the tables regularly to keep the query planner statistics up-to-date. Cheers, Steve Butler assetid | integer | not null default 0Indexes: sq_asset_pkey primary key btree (assetid) assetid | character varying(255) | not null default '0'EXPLAIN ANALYZE SELECT p.*FROM sq_asset a, sq_asset_permission pWHERE a.assetid = p.assetidAND p.permission = '1'AND p.access = '1'AND p.userid = '0'; QUERY PLAN Nested Loop (cost=0.00..4743553.10 rows=2582 width=27) (actual time=237.91..759310.60 rows=11393 loops=1) Join Filter: (("inner".assetid)::text = ("outer".assetid)::text) -> Seq Scan on sq_asset_permission p (cost=0.00..1852.01 rows=2288 width=23) (actual time=0.06..196.90 rows=12873 loops=1) Filter: ((permission = 1) AND ("access" = '1'::bpchar) AND (userid = '0'::character varying)) -> Seq Scan on sq_asset a (cost=0.00..1825.67 rows=16467 width=4) (actual time=1.40..29.09 rows=16467 loops=12873) Total runtime: 759331.85 msec(6 rows)
Re: [PERFORM] [ADMIN] syslog slowing the database?
It might depend on how you're rotating it. Try the copy/truncate method instead of moving the log file. If you move the log file to another filename you usually have to restart the app doing the logging before it starts logging again. Chris. -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Christopher Kings-Lynne Sent: Thursday, March 11, 2004 12:35 PM To: Tom Lane Cc: Greg Spiegelberg; PgSQL Performance ML; Postgres Admin List Subject: Re: [PERFORM] [ADMIN] syslog slowing the database? > You could also consider not using syslog at all: let the postmaster > output to its stderr, and pipe that into a log-rotation program. I > believe some people use Apache's log rotator for this with good > results. Not an option I'm afraid. PostgreSQL just jams and stops logging after the first rotation... I've read in the docs that syslog logging is the only "production" solution... Chris ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED]) ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match