Re: Optimizing subqueries [ Was: Re: VTI, Indexed Lookup and the Query Optimizer ]

Army Tue, 17 Jan 2006 11:28:38 -0800

Jeffrey Lichtman wrote:

Based on logic in the code, the example query isn't flattenable. . .
That's because whoever wrote the code made it handle only the simplest case. I doubt it would be hard to make it flatten many other types of table subqueries.

The example I gave was a simplified scenario to show how a PRN can end up with a SelectNode beneath it--which was (I believe?) the example requested by Satheesh. The actual query that prompted this question, though, has a subquery that uses aggregates and a GROUP BY--i.e. the subquery *cannot*, as I understand it, be flattened into the outer query, because the aggregate/group-by functionality has to be performed before evaluation of the outer query can occur. Ex.

select t1.i, x1.s1 from t1 inner join (select distinct j, sum(b) s1 from t2 group by j) x1 on x1.j = t1.i;

In a case like this, where the subquery _can't_ be flattened, it still seems to me that a hash join could be beneficial--but because of the logic in ProjectRestrictNode.isMaterializable(), the hash join isn't allowed. So what I'm wondering is _why_ is that logic there? That is, when a subquery cannot be flattened into an outer query, the optimizer always considers a hash join to be infeasible. Why is that?
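To make concrete what a hash join would buy for the example query, here is a rough sketch in plain Java. It is only an illustration: the class and method names (HashJoinSketch, materializeSubquery, hashJoin) are made up for this message and are not Derby code, the tables are stand-in in-memory lists, and NULL handling is ignored. The subquery is evaluated once into a hash table keyed on j, and each row of t1 then probes that table instead of re-evaluating the subquery.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names come from Derby.
class HashJoinSketch {

    // Step 1: evaluate "select distinct j, sum(b) s1 from t2 group by j" once
    // and materialize the result as a hash table keyed on j.
    // Each t2 row is an int[] {j, b}. GROUP BY j already yields one row per j,
    // so the DISTINCT adds nothing here.
    static Map<Integer, Long> materializeSubquery(List<int[]> t2) {
        Map<Integer, Long> sumByJ = new LinkedHashMap<>();
        for (int[] row : t2) {
            sumByJ.merge(row[0], (long) row[1], Long::sum);
        }
        return sumByJ;
    }

    // Step 2: probe the hash table once per outer row (x1.j = t1.i).
    // Returns rows of {t1.i, x1.s1}; outer rows with no match are dropped,
    // as an inner join requires.
    static List<long[]> hashJoin(List<Integer> t1, Map<Integer, Long> x1) {
        List<long[]> result = new ArrayList<>();
        for (int i : t1) {
            Long s1 = x1.get(i);
            if (s1 != null) {
                result.add(new long[] {i, s1});
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<int[]> t2 = List.of(new int[] {1, 10}, new int[] {1, 5}, new int[] {2, 7});
        List<Integer> t1 = List.of(1, 2, 3);
        for (long[] row : hashJoin(t1, materializeSubquery(t2))) {
            System.out.println(row[0] + ", " + row[1]);
        }
    }
}
```

The point of the sketch is just that the aggregate is computed a single time and each outer row costs one hash lookup; whether that actually beats the alternatives is for the optimizer's cost model to decide, which is why I'd like the hash join to at least be considered.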

If the answer is simply that "no one has looked at removing this restriction yet", then that's fine--that's what I want to know. If, however, there is a deliberate reason for leaving this restriction in place, I was hoping someone out there knew what that reason was. The comments in PRN.isMaterializable() seem to suggest this wasn't meant to be a permanent restriction, so my guess is that "no one has done it yet" is the correct answer.

My general philosophy toward query performance issues is that I prefer massaging the query into a standard form and letting the optimizer handle it to putting in special-case logic for certain types of queries.

I agree, avoiding special-case is good. Which is why the special-case logic for PRN's over non-optimizable child nodes in isMaterializable() seems odd to me, and hence my question.

The optimizer can do things the rest of query processing would have a difficult time with. For example, if an inner join in a subquery is flattened into the outer query, the optimizer is free to put the tables from the subquery anywhere in the join order, even if it means interspersing the subquery's tables with the outer query's tables.

This is, as you say, a good reason to look at increasing Derby's ability to flatten subqueries. But in cases where subqueries simply cannot be flattened, I think the restriction in PRN.isMaterializable() is still going to be a cause for sub-optimal performance, because it disallows hash joins where they could potentially be useful.

Thanks for your patience with my questions on this topic; I'm just trying to get a grasp on how this all is supposed to work...


Army
