Re: [PERFORM] why is a constraint not 'pushed down' into a subselect when this subselect is using a 'group by' ??

Frank van Vugt Tue, 15 Jun 2004 08:36:24 -0700

> Obviously this is on toy tables

The query is simplified, yes. But the data in the tables is real, albeit 
they're not that large.


> You're misinterpreting it.

I might very well be ;)
But I also get the feeling I didn't explain to you well enough what I meant...

> Without the group by, the plan is a candidate for 
nestloop-with-inner-index-scan

Yes, I understand that. I only ditched the group by to check whether the 
contraint on the article table was indeed recognized as a constraint on the 
package table based on 'article.id = foo.article_id'. And it is/was.

> with the group by, there's another step in the way.

Yep, but on my system, package gets seq-scanned *without* any additional 
constraint, resulting in a loooooong processing time.

> Pushing down into subselects does get done, for instance in CVS tip
> I can change the last part of your query to "foo.article_id < 50"
> and get ...

This is why I think I wasn't clear enough.

In the real thing, the constraint on the article table is built by some 
external source and I cannot easily make assumptions to translate these to a 
constraint on the package table, especially since I expect the planner to be 
far better in that than me ;)

So, my base query is this:

        select 
                article.id, p_min 
        from 
                article, 
                (select 
                        article_id, min(amount) as p_min 
                from 
                        package 
                group by 
                        article_id
                ) as foo 
        where 
                article.id = foo.article_id and 
                <some constraint on article table>;


Now, when <constraint> = true, this obviously results in seqscans:

 Hash Join  (cost=1106.79..1251.46 rows=4452 width=8)
   Hash Cond: ("outer".article_id = "inner".id)
   ->  Subquery Scan foo  (cost=726.10..781.74 rows=4451 width=8)
         ->  HashAggregate  (cost=726.10..737.23 rows=4451 width=8)
               ->  Seq Scan on package  (cost=0.00..635.40 rows=18140 width=8)
   ->  Hash  (cost=369.35..369.35 rows=4535 width=4)
         ->  Seq Scan on article  (cost=0.00..369.35 rows=4535 width=4)

But when <constraint> = 'article.id < 50', only article is indexscanned:

 Hash Join  (cost=730.11..808.02 rows=1 width=8)
   Hash Cond: ("outer".article_id = "inner".id)
   ->  Subquery Scan foo  (cost=726.10..781.74 rows=4451 width=8)
         ->  HashAggregate  (cost=726.10..737.23 rows=4451 width=8)
               ->  Seq Scan on package  (cost=0.00..635.40 rows=18140 width=8)
   ->  Hash  (cost=4.01..4.01 rows=1 width=4)
         ->  Index Scan using article_pkey on article  (cost=0.00..4.01 rows=1 
width=4)
               Index Cond: (id < 50)


Which still results in poor performance due to the seqscan on package.

Putting the constraint on package is boosting performance indeed, but I cannot 
make that assumption.

So, what I was asking was:

When the 'article.id < 50' constraint is added, it follows that 
'foo.article_id < 50' is a constraint as well. Why is this constraint not 
used to avoid the seqscan on package?

> Obviously this is on toy tables, but the point is that the constraint
> does get pushed down through the GROUP BY when appropriate.

I've seen it being pushed down when it already was defined as a constraint on 
the group by, like in your example. 

If necessary, I'll throw together a few commands that build some example 
tables to show what I mean.



-- 
Best,




Frank.


---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
    (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])

Re: [PERFORM] why is a constraint not 'pushed down' into a subselect when this subselect is using a 'group by' ??

Reply via email to