Re: [PERFORM] Speeding up select distinct
Laurent Martelli <[EMAIL PROTECTED]> writes: > PFC>SELECT owner from pictures group by owner; > > That's a slight improvement, but there's still a seq scan on pictures: It should be a sequential scan. An index will be slower. > HashAggregate (cost=114.38..114.38 rows=21 width=4) (actual > time=7.585..7.605 rows=21 loops=1) >-> Seq Scan on pictures (cost=0.00..103.70 rows=4270 width=4) (actual > time=0.015..3.272 rows=4270 loops=1) > Total runtime: 7.719 ms That's the best plan for this query. -- greg ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [PERFORM] Speeding up select distinct
> I just wished there was a means to fully automate all this and render > it transparent to the user, just like an index. > > Merlin> Voila! Merlin p.s. normalize your data always! > > I have this: > > pictures( > PictureID serial PRIMARY KEY, > Owner integer NOT NULL REFERENCES users, > [...]); > CREATE TABLE users ( > UserID serial PRIMARY KEY, > Name character varying(255), > [...]); > > Isn't it normalized ? try: select * from users where UserID in (select pictureId from pictures); select * userid from users intersect select pictureid from pictures; select distinct userid, [...] from users, pictures where user userid = pictureid) if none of these give you what you want then you can solve this with a new tble, picture_user using the instructions I gave previously. Not sure if your data is normalized, but ISTM you are over-using surrogate keys. It may not be possible, but consider downgrading ID columns to unique and picking a natural key. Now you get better benefit of RI and you can sometimes remove joins from certain queries. Rule: use natural keys when you can, surrogate keys when you have to. Corollary: use domains for fields used in referential integrity. Merlin ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [PERFORM] Speeding up select distinct
On Wed, 2005-03-16 at 19:31 +0100, Laurent Martelli wrote: > > "Rod" == Rod Taylor <[EMAIL PROTECTED]> writes: > > Rod> On Wed, 2005-03-16 at 18:58 +0100, Laurent Martelli wrote: > >> Consider this query: > >> > >> SELECT distinct owner from pictures; > > Rod> The performance has nothing to do with the number of rows > Rod> returned, but rather the complexity of calculations and amount > Rod> of data to sift through in order to find it. > > Yes, but I thought that an index might be able to know what distinct > values there are and help optime that query very much. The index does know. You just have to visit all of the pages within the index to find out, which it does, and that's why you dropped 10ms. But if you want a sub ms query, you're going to have to normalize the structure. -- ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Speeding up select distinct
> "Merlin" == Merlin Moncure <[EMAIL PROTECTED]> writes: >> Consider this query: >> >> SELECT distinct owner from pictures; Merlin> [...] >> Any ideas, apart from more or less manually maintaining a list of >> distinct owners in another table ? Merlin> you answered your own question. With a 20 row owners table, Merlin> you should be directing your efforts there group by is Merlin> faster than distinct, but both are very wasteful and Merlin> essentially require s full seqscan of the detail table. Merlin> With a little hacking, you can change 'manual maintenance' Merlin> to 'automatic maintenance'. Merlin> 1. create table owner as select distinct owner from Merlin> pictures; 2. alter table owner add constraint Merlin> owner_pkey(owner); 3. alter table pictures add constraint Merlin> ri_picture_owner(owner) references owner; 4. make a little Merlin> append_ownder function which adds an owner to the owner Merlin> table if there is not already one there. Inline this to your Merlin> insert statement on pictures. I just wished there was a means to fully automate all this and render it transparent to the user, just like an index. Merlin> Voila! Merlin p.s. normalize your data always! I have this: pictures( PictureID serial PRIMARY KEY, Owner integer NOT NULL REFERENCES users, [...]); CREATE TABLE users ( UserID serial PRIMARY KEY, Name character varying(255), [...]); Isn't it normalized ? -- Laurent Martelli [EMAIL PROTECTED]Java Aspect Components http://www.aopsys.com/ http://jac.objectweb.org ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [PERFORM] Speeding up select distinct
Wow, what a fast response !!! > "PFC" == PFC <[EMAIL PROTECTED]> writes: PFC> Try : PFC> SELECT owner from pictures group by owner; That's a slight improvement, but there's still a seq scan on pictures: HashAggregate (cost=114.38..114.38 rows=21 width=4) (actual time=7.585..7.605 rows=21 loops=1) -> Seq Scan on pictures (cost=0.00..103.70 rows=4270 width=4) (actual time=0.015..3.272 rows=4270 loops=1) Total runtime: 7.719 ms -- Laurent Martelli [EMAIL PROTECTED]Java Aspect Components http://www.aopsys.com/ http://jac.objectweb.org ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Speeding up select distinct
> "Rod" == Rod Taylor <[EMAIL PROTECTED]> writes: Rod> On Wed, 2005-03-16 at 18:58 +0100, Laurent Martelli wrote: >> Consider this query: >> >> SELECT distinct owner from pictures; Rod> The performance has nothing to do with the number of rows Rod> returned, but rather the complexity of calculations and amount Rod> of data to sift through in order to find it. Yes, but I thought that an index might be able to know what distinct values there are and help optime that query very much. -- Laurent Martelli [EMAIL PROTECTED]Java Aspect Components http://www.aopsys.com/ http://jac.objectweb.org ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [PERFORM] Speeding up select distinct
> Consider this query: > > SELECT distinct owner from pictures; [...] > Any ideas, apart from more or less manually maintaining a list of > distinct owners in another table ? you answered your own question. With a 20 row owners table, you should be directing your efforts there group by is faster than distinct, but both are very wasteful and essentially require s full seqscan of the detail table. With a little hacking, you can change 'manual maintenance' to 'automatic maintenance'. 1. create table owner as select distinct owner from pictures; 2. alter table owner add constraint owner_pkey(owner); 3. alter table pictures add constraint ri_picture_owner(owner) references owner; 4. make a little append_ownder function which adds an owner to the owner table if there is not already one there. Inline this to your insert statement on pictures. Voila! Merlin p.s. normalize your data always! ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [PERFORM] Speeding up select distinct
On Wed, 2005-03-16 at 18:58 +0100, Laurent Martelli wrote: > Consider this query: > > SELECT distinct owner from pictures; The performance has nothing to do with the number of rows returned, but rather the complexity of calculations and amount of data to sift through in order to find it. > Any ideas, apart from more or less manually maintaining a list of > distinct owners in another table ? This would be the proper thing to do, along with adding a foreign key from pictures to the new owner structure for integrity enforcement. -- ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [PERFORM] Speeding up select distinct
Try : SELECT owner from pictures group by owner; Any ideas, apart from more or less manually maintaining a list of distinct owners in another table ? That would be a good idea too for normalizing your database. ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
[PERFORM] Speeding up select distinct
Consider this query: SELECT distinct owner from pictures; Unique (cost=361.18..382.53 rows=21 width=4) (actual time=14.197..17.639 rows=21 loops=1) -> Sort (cost=361.18..371.86 rows=4270 width=4) (actual time=14.188..15.450 rows=4270 loops=1) Sort Key: "owner" -> Seq Scan on pictures (cost=0.00..103.70 rows=4270 width=4) (actual time=0.012..5.795 rows=4270 loops=1) Total runtime: 19.147 ms I thought that 19ms to return 20 rows out of a 4000 rows table so I added an index: CREATE INDEX pictures_owner ON pictures (owner); It gives a slight improvement: Unique (cost=0.00..243.95 rows=21 width=4) (actual time=0.024..10.293 rows=21 loops=1) -> Index Scan using pictures_owner on pictures (cost=0.00..233.27 rows=4270 width=4) (actual time=0.022..8.227 rows=4270 loops=1) Total runtime: 10.369 ms But still, it's a lot for 20 rows. I looked at other type of indexes, but they seem to either not give beter perfs or be irrelevant. Any ideas, apart from more or less manually maintaining a list of distinct owners in another table ? -- Laurent Martelli [EMAIL PROTECTED]Java Aspect Components http://www.aopsys.com/ http://jac.objectweb.org ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]