As part of my work on DERBY-2998, the implementation of the ROW_NUMBER() window function, I'm having trouble with the optimizer and its (lack of) pushing predicates down the query tree.
Hi Thomas & Army, this is a very interesting discussion and I'm learning a lot by following it, so please keep sending this useful information along as that way we all get a deeper understanding!
It seems to me that there are perhaps several types of restrictions: - Restrictions that can end up being pushed all the way down into the store, to be satisfied by index qualification. The optimization win by using an index to satisfy a where clause is so high that the optimizer probably pays special attention to these. - Restrictions that end up by implemented by generating code into the "restriction" method of the activation's generated code, which is then invoked dynamically by various ResultSet implementations during getNextRowCore(). I imagine that the optimization win by shoving these sort of restrictions around in the tree may be smaller, so perhaps the optimizer isn't so aggressive about considering alternate positions for them? That is, perhaps the "if this is a simple column reference" test in PredicateList.pushExpressionIntoSelect is being used as a heuristic for "if this predicate might be something we could potentially use as an index qualifier"? With the new restriction approach that you are proposing for DERBY-2998, am I correct that you are suggesting we might invent a new ResultSet node implementation, which computes the ROW_NUMBER() value for its rows? If I am understanding that correctly, are you envisioning that a WHERE clause which refers to the ROW_NUMBER() result will be implemented by generating comparison code into the "restriction" method of the generated activation code? It seems to me that the (potential) optimization win by performing the ROW_NUMBER() restriction at the right level of the execution tree is quite high (in fact, it's the whole point of DERBY-581), so it makes sense to me that it would be worth the optimizer's effort to consider alternate locations for it, and thus you seem to be heading down the right path with your research. I guess I'm mostly just trying to educate myself here, but hopefully some of these ideas and questions will help you clarify your studies and help us all understand how this processing works. thanks, bryan
