subject:"\[PERFORM\] Speeding up select distinct"

Re: [PERFORM] Speeding up select distinct

2005-03-16 Thread Greg Stark

Laurent Martelli <[EMAIL PROTECTED]> writes:

>   PFC>SELECT owner from pictures group by owner;
> 
> That's a slight improvement, but there's still a seq scan on pictures:

It should be a sequential scan. An index will be slower.

>  HashAggregate  (cost=114.38..114.38 rows=21 width=4) (actual 
> time=7.585..7.605 rows=21 loops=1)
>->  Seq Scan on pictures  (cost=0.00..103.70 rows=4270 width=4) (actual 
> time=0.015..3.272 rows=4270 loops=1)
>  Total runtime: 7.719 ms

That's the best plan for this query.

-- 
greg


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])

Re: [PERFORM] Speeding up select distinct

2005-03-16 Thread Merlin Moncure

> I just wished there was a means to fully automate all this and render
> it transparent to the user, just like an index.
> 
>   Merlin> Voila!  Merlin p.s. normalize your data always!
> 
> I have this:
> 
> pictures(
> PictureID serial PRIMARY KEY,
> Owner integer NOT NULL REFERENCES users,
> [...]);
> CREATE TABLE users (
> UserID serial PRIMARY KEY,
> Name character varying(255),
> [...]);
> 
> Isn't it normalized ?

try:
select * from users where UserID in (select pictureId from pictures);
select * userid from users intersect select pictureid from pictures;
select distinct userid, [...] from users, pictures where user userid =
pictureid)

if none of these give you what you want then you can solve this with a
new tble, picture_user using the instructions I gave previously.

Not sure if your data is normalized, but ISTM you are over-using
surrogate keys.  It may not be possible, but consider downgrading ID
columns to unique and picking a natural key.  Now you get better benefit
of RI and you can sometimes remove joins from certain queries.

Rule: use natural keys when you can, surrogate keys when you have to.
Corollary: use domains for fields used in referential integrity.

Merlin


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

Re: [PERFORM] Speeding up select distinct

2005-03-16 Thread Rod Taylor

On Wed, 2005-03-16 at 19:31 +0100, Laurent Martelli wrote:
> > "Rod" == Rod Taylor <[EMAIL PROTECTED]> writes:
> 
>   Rod> On Wed, 2005-03-16 at 18:58 +0100, Laurent Martelli wrote:
>   >> Consider this query:
>   >> 
>   >> SELECT distinct owner from pictures;
> 
>   Rod> The performance has nothing to do with the number of rows
>   Rod> returned, but rather the complexity of calculations and amount
>   Rod> of data to sift through in order to find it.
> 
> Yes, but I thought that an index might be able to know what distinct
> values there are and help optime that query very much.

The index does know. You just have to visit all of the pages within the
index to find out, which it does, and that's why you dropped 10ms.

But if you want a sub ms query, you're going to have to normalize the
structure.

-- 


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org

Re: [PERFORM] Speeding up select distinct

2005-03-16 Thread Laurent Martelli

> "Merlin" == Merlin Moncure <[EMAIL PROTECTED]> writes:

  >> Consider this query:
  >> 
  >> SELECT distinct owner from pictures;

  Merlin> [...]
  >> Any ideas, apart from more or less manually maintaining a list of
  >> distinct owners in another table ?

  Merlin> you answered your own question.  With a 20 row owners table,
  Merlin> you should be directing your efforts there group by is
  Merlin> faster than distinct, but both are very wasteful and
  Merlin> essentially require s full seqscan of the detail table.

  Merlin> With a little hacking, you can change 'manual maintenance'
  Merlin> to 'automatic maintenance'.

  Merlin> 1. create table owner as select distinct owner from
  Merlin> pictures; 2. alter table owner add constraint
  Merlin> owner_pkey(owner); 3. alter table pictures add constraint
  Merlin> ri_picture_owner(owner) references owner; 4. make a little
  Merlin> append_ownder function which adds an owner to the owner
  Merlin> table if there is not already one there. Inline this to your
  Merlin> insert statement on pictures.

I just wished there was a means to fully automate all this and render
it transparent to the user, just like an index.

  Merlin> Voila!  Merlin p.s. normalize your data always!

I have this:

pictures(
PictureID serial PRIMARY KEY,
Owner integer NOT NULL REFERENCES users,
[...]);
CREATE TABLE users (
UserID serial PRIMARY KEY,
Name character varying(255),
[...]);

Isn't it normalized ?

-- 
Laurent Martelli
[EMAIL PROTECTED]Java Aspect Components
http://www.aopsys.com/  http://jac.objectweb.org

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

Re: [PERFORM] Speeding up select distinct

2005-03-16 Thread Laurent Martelli

Wow, what a fast response !!!

> "PFC" == PFC  <[EMAIL PROTECTED]> writes:

  PFC>  Try :

  PFC>  SELECT owner from pictures group by owner;

That's a slight improvement, but there's still a seq scan on pictures:

 HashAggregate  (cost=114.38..114.38 rows=21 width=4) (actual time=7.585..7.605 
rows=21 loops=1)
   ->  Seq Scan on pictures  (cost=0.00..103.70 rows=4270 width=4) (actual 
time=0.015..3.272 rows=4270 loops=1)
 Total runtime: 7.719 ms




-- 
Laurent Martelli
[EMAIL PROTECTED]Java Aspect Components
http://www.aopsys.com/  http://jac.objectweb.org


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly

Re: [PERFORM] Speeding up select distinct

2005-03-16 Thread Laurent Martelli

> "Rod" == Rod Taylor <[EMAIL PROTECTED]> writes:

  Rod> On Wed, 2005-03-16 at 18:58 +0100, Laurent Martelli wrote:
  >> Consider this query:
  >> 
  >> SELECT distinct owner from pictures;

  Rod> The performance has nothing to do with the number of rows
  Rod> returned, but rather the complexity of calculations and amount
  Rod> of data to sift through in order to find it.

Yes, but I thought that an index might be able to know what distinct
values there are and help optime that query very much.

-- 
Laurent Martelli
[EMAIL PROTECTED]Java Aspect Components
http://www.aopsys.com/  http://jac.objectweb.org

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])

Re: [PERFORM] Speeding up select distinct

2005-03-16 Thread Merlin Moncure

> Consider this query:
> 
> SELECT distinct owner from pictures;

[...]
> Any ideas, apart from more or less manually maintaining a list of
> distinct owners in another table ?

you answered your own question.  With a 20 row owners table, you should
be directing your efforts there group by is faster than distinct, but
both are very wasteful and essentially require s full seqscan of the
detail table.  

With a little hacking, you can change 'manual maintenance' to 'automatic
maintenance'.

1. create table owner as select distinct owner from pictures;
2. alter table owner add constraint owner_pkey(owner);
3. alter table pictures add constraint ri_picture_owner(owner)
references owner;
4. make a little append_ownder function which adds an owner to the owner
table if there is not already one there. Inline this to your insert
statement on pictures.

Voila!
Merlin
p.s. normalize your data always!

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

Re: [PERFORM] Speeding up select distinct

2005-03-16 Thread Rod Taylor

On Wed, 2005-03-16 at 18:58 +0100, Laurent Martelli wrote:
> Consider this query:
> 
> SELECT distinct owner from pictures; 

The performance has nothing to do with the number of rows returned, but
rather the complexity of calculations and amount of data to sift through
in order to find it.

> Any ideas, apart from more or less manually maintaining a list of
> distinct owners in another table ?

This would be the proper thing to do, along with adding a foreign key
from pictures to the new owner structure for integrity enforcement.
-- 


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org

Re: [PERFORM] Speeding up select distinct

2005-03-16 Thread PFC

Try :
SELECT owner from pictures group by owner;
Any ideas, apart from more or less manually maintaining a list of
distinct owners in another table ?
That would be a good idea too for normalizing your database.
---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org

[PERFORM] Speeding up select distinct

2005-03-16 Thread Laurent Martelli

Consider this query:

SELECT distinct owner from pictures; 

 Unique  (cost=361.18..382.53 rows=21 width=4) (actual time=14.197..17.639 
rows=21 loops=1)
   ->  Sort  (cost=361.18..371.86 rows=4270 width=4) (actual 
time=14.188..15.450 rows=4270 loops=1)
 Sort Key: "owner"
 ->  Seq Scan on pictures  (cost=0.00..103.70 rows=4270 width=4) 
(actual time=0.012..5.795 rows=4270 loops=1)
 Total runtime: 19.147 ms

I thought that 19ms to return 20 rows out of a 4000 rows table so I
added an index:

CREATE INDEX pictures_owner ON pictures (owner);

It gives a slight improvement:

 Unique  (cost=0.00..243.95 rows=21 width=4) (actual time=0.024..10.293 rows=21 
loops=1)
   ->  Index Scan using pictures_owner on pictures  (cost=0.00..233.27 
rows=4270 width=4) (actual time=0.022..8.227 rows=4270 loops=1)
 Total runtime: 10.369 ms

But still, it's a lot for 20 rows. I looked at other type of indexes,
but they seem to either not give beter perfs or be irrelevant. 

Any ideas, apart from more or less manually maintaining a list of
distinct owners in another table ?

-- 
Laurent Martelli
[EMAIL PROTECTED]Java Aspect Components
http://www.aopsys.com/  http://jac.objectweb.org


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]

Re: [PERFORM] Speeding up select distinct

Re: [PERFORM] Speeding up select distinct

Re: [PERFORM] Speeding up select distinct

Re: [PERFORM] Speeding up select distinct

Re: [PERFORM] Speeding up select distinct

Re: [PERFORM] Speeding up select distinct

Re: [PERFORM] Speeding up select distinct

Re: [PERFORM] Speeding up select distinct

Re: [PERFORM] Speeding up select distinct

[PERFORM] Speeding up select distinct

10 matches

Site Navigation

Mail list logo

Footer information