Re: [PERFORM] Help with rewriting query

Junaili Lie Thu, 09 Jun 2005 18:37:53 -0700

Hi Kevin,
Thanks for the reply.
I tried that query. It definately faster, but not fast enough (took
around 50 second to complete).
I have around 2.5 million on food and 1000 on person.
Here is the query plan:
                                                   QUERY PLAN
----------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=0.00..11662257.52 rows=1441579 width=16)
   Merge Cond: ("outer".id = "inner".p_id)
   ->  Index Scan using person_pkey on person p  (cost=0.00..25.17
rows=569 width=8)
   ->  Index Scan using p_id_food_index on food f 
(cost=0.00..11644211.28 rows=1441579 width=16)
         Filter: (NOT (subplan))
         SubPlan
           ->  Index Scan using p_id_food_index on food f2 
(cost=0.00..11288.47 rows=2835 width=177)
                 Index Cond: (p_id = $0)
                 Filter: (id > $1)
(9 rows)


I appreciate if you have further ideas to troubleshoot this issue.
Thank you!

On 6/8/05, Kevin Grittner <[EMAIL PROTECTED]> wrote:
> This is a pattern which I've seen many of times.  I call it a "best
> choice" query -- you can easily match a row from one table against any
> of a number of rows in another, the trick is to pick the one that
> matters most.  I've generally found that I want the query results to
> show more than the columns used for making the choice (and there can be
> many), which rules out the min/max technique.  What works in a pretty
> straitforward way, and generally optimizes at least as well as the
> alternatives, is to join to the set of candidate rows and add a "not
> exists" test to eliminate all but the best choice.
> 
> For your example, I've taken some liberties and added hypothetical
> columns from both tables to the result set, to demonstrate how that
> works.  Feel free to drop them or substitute actual columns as you see
> fit.  This will work best if there is an index for the food table on
> p_id and id.  Please let me know whether this works for you.
> 
> select p.id as p_id, p.fullname, f.id, f.foodtype, f.ts
> from food f join person p
> on f.p_id = p.id
> and not exists (select * from food f2 where f2.p_id = f.p_id and f2.id >
> f.id)
> order by p_id
> 
> Note that this construct works for inner or outer joins and works
> regardless of how complex the logic for picking the best choice is.  I
> think one reason this tends to optimize well is that an EXISTS test can
> finish as soon as it finds one matching row.
> 
> -Kevin
> 
> 
> >>> Junaili Lie <[EMAIL PROTECTED]> 06/08/05 2:34 PM >>>
> Hi,
> I have the following table:
> person - primary key id, and some attributes
> food - primary key id, foreign key p_id reference to table person.
> 
> table food store all the food that a person is eating. The more recent
> food is indicated by the higher food.id.
> 
> I need to find what is the most recent food a person ate for every
> person.
> The query:
> select f.p_id, max(f.id) from person p, food f where p.id=f.p_id group
> by f.p_id will work.
> But I understand this is not the most efficient way. Is there another
> way to rewrite this query? (maybe one that involves order by desc
> limit 1)
> 
> Thank you in advance.
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 9: the planner will ignore your desire to choose an index scan if
> your
>      joining column's datatypes do not match
> 
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
> 
>               http://www.postgresql.org/docs/faq
>

---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings

Re: [PERFORM] Help with rewriting query

Reply via email to