Re: [PERFORM] is it possible to for the planner to optimize this form?

2004-06-07 Thread Josh Berkus
Merlin,

 select t.key, t.field from t a
 where
 (
 select count(*) from t b
 where b.field  a.field
 ) = k

 The subplan (either index or seq. scan) executes once for each row in t,
 which of course takes forever.

 This query is a way of achieving LIMIT type results (substitute n-1
 desired rows for k) using standard SQL, which is desirable in some
 circumstances.  Is it theoretically possible for this to be optimized?

I don't think so, no.   PostgreSQL does have some issues using indexes for 
count() queires which makes the situation worse.   However, with the query 
you presented, I don't see any way around the planner executing the subquery 
once for every row in t.

Except, of course, for some kind of scheme involving materialized views, if 
you don't need up-to-the minute data.   In that case, you could store in a 
table the count(*)s of t for each threshold value of b.field.  But, 
dynamically, that would be even slower.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] is it possible to for the planner to optimize this

2004-06-07 Thread Duane Lee - EGOVX
Title: RE: [PERFORM] is it possible to for the planner to optimize this form?





I didn't really look that closely at the problem but have you thought of trying:


select t.key, t.field from t a
 , (select count(*) as cntb from t b
 where b.field  a.field) as dmytbl
where
cntb = k


This is called an inline view or sometimes a nested table. You would be joining table t to this inline view with the join criteria being cntb = k where k is in t.


-Original Message-
From: Josh Berkus [mailto:[EMAIL PROTECTED]]
Sent: Monday, June 07, 2004 1:32 PM
To: Merlin Moncure; [EMAIL PROTECTED]
Subject: Re: [PERFORM] is it possible to for the planner to optimize
this form?



Merlin,


 select t.key, t.field from t a
 where
 (
 select count(*) from t b
 where b.field  a.field
 ) = k

 The subplan (either index or seq. scan) executes once for each row in t,
 which of course takes forever.

 This query is a way of achieving LIMIT type results (substitute n-1
 desired rows for k) using standard SQL, which is desirable in some
 circumstances. Is it theoretically possible for this to be optimized?


I don't think so, no. PostgreSQL does have some issues using indexes for 
count() queires which makes the situation worse. However, with the query 
you presented, I don't see any way around the planner executing the subquery 
once for every row in t.


Except, of course, for some kind of scheme involving materialized views, if 
you don't need up-to-the minute data. In that case, you could store in a 
table the count(*)s of t for each threshold value of b.field. But, 
dynamically, that would be even slower.


-- 
Josh Berkus
Aglio Database Solutions
San Francisco


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly