[
https://issues.apache.org/jira/browse/DRILL-3912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14947906#comment-14947906
]
Steven Phillips commented on DRILL-3912:
----------------------------------------
1) I had not enabled CSE in hash join, so it didn't have that problem. Now that
I have enabled it in hash join, I am seeing the same SR error.
2) In this case, it looks like the ConstantFilter is causing the '1 + 2' and '1
+ 3' parts of the expressions to be resolved first, and then 'a + 1' is no
longer common. Duplicate vector reads are removed, though. I think this
behavior is probably fine.
3) I am not targeting this for 1.2. Probably for 1.3. My main motivation here
was to solve a problem I was running into in my Union-type work. Function
resolution when the input has a Union type involves case statements that
check the current type of the input, and then execute a branch based on that
type. In this case, the condition expression as well as both branches will
reference the input. For example,
1 + a
would become something like
{code}
case when typeOf(a) = int
then 1 + cast(a as int)
when typeOf(a) = varchar
then 1 + cast(cast(a as varchar) as int)
end
{code}
So you can see that a single reference to 'a' becomes 3 references. And 'a'
might not just be a ValueVectorReadExpression, it could be the output from some
other expression tree. And if an input has more than 2 types, or if a function
has multiple Union-type inputs, the complexity of the expression increases
dramatically, and the amount of generated code gets to be quite large. I needed
to find some way to fix this.
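As a rough illustration of the code-size problem described above (a toy sketch, not Drill's actual code generator), the snippet below emits one assignment per expression node for the case expression from the comment. The tuple encoding, the `emit` helper, and the function names `mul`, `cast_int`, and `cast_varchar` are all hypothetical. It also evaluates every branch eagerly, so it models only the amount of emitted code, not the semantics of a case statement.

```python
def emit(expr, lines, seen=None):
    """Append one assignment per visited node; return the name holding its value.

    expr is a column name or literal (str or int) or a tuple (fn, arg, ...).
    Passing a dict as `seen` turns on reuse of already-emitted subexpressions.
    """
    if not isinstance(expr, tuple):
        return str(expr)  # leaf: referenced directly, no temporary needed
    if seen is not None and expr in seen:
        return seen[expr]  # identical subtree already emitted: reuse its temporary
    fn, *args = expr
    arg_names = [emit(arg, lines, seen) for arg in args]
    name = f"t{len(lines)}"  # unique, because each emission adds exactly one line
    lines.append(f"{name} = {fn}({', '.join(arg_names)})")
    if seen is not None:
        seen[expr] = name
    return name


# Hypothetical input: 'a' is itself the output of another expression tree.
a = ("mul", "x", "y")

# The case expression from the comment, written as a tree.
union_add = (
    "case",
    ("eq", ("typeOf", a), "int"), ("add", 1, ("cast_int", a)),
    ("eq", ("typeOf", a), "varchar"), ("add", 1, ("cast_int", ("cast_varchar", a))),
)

plain, shared = [], []
emit(union_add, plain)            # no reuse: the input subtree is emitted per reference
emit(union_add, shared, seen={})  # reuse: each distinct subtree is emitted once

print(len(plain), len(shared))
```

Without reuse, the `mul(x, y)` assignment is emitted once for every reference to 'a' in the tree; with the `seen` map it is emitted once and the later references pick up the existing temporary.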
> Common subexpression elimination in code generation
> ---------------------------------------------------
>
> Key: DRILL-3912
> URL: https://issues.apache.org/jira/browse/DRILL-3912
> Project: Apache Drill
> Issue Type: Bug
> Reporter: Steven Phillips
> Assignee: Jinfeng Ni
>
> Drill currently will evaluate the full expression tree, even if there are
> redundant subtrees. Many of these redundant evaluations can be eliminated by
> reusing the results from previously evaluated expression trees.
> For example,
> {code}
> select a + 1, (a + 1) * (a - 1) from t
> {code}
> Will compute the entire (a + 1) expression twice. With CSE, it will only be
> evaluated once.
> The benefit will be reducing the work done when evaluating expressions, as
> well as reducing the amount of code that is generated, which could also lead
> to better JIT optimization.
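The saving in the quoted example can be sketched with a toy evaluator that caches the result of each distinct subexpression within a row. This is only an illustration under assumed names (`Evaluator`, `OPS`, `run`, and the tuple encoding are hypothetical); Drill itself generates Java code rather than interpreting a tree like this.

```python
# Binary operators supported by this toy evaluator.
OPS = {
    "+": lambda x, y: x + y,
    "-": lambda x, y: x - y,
    "*": lambda x, y: x * y,
}


class Evaluator:
    """Evaluates tuple-encoded expressions for one row, optionally reusing results."""

    def __init__(self, row, use_cse):
        self.row = row
        self.use_cse = use_cse
        self.cache = {}    # subexpression -> value already computed for this row
        self.op_count = 0  # number of operator applications actually performed

    def eval(self, expr):
        if isinstance(expr, int):
            return expr            # literal
        if isinstance(expr, str):
            return self.row[expr]  # column reference
        if self.use_cse and expr in self.cache:
            return self.cache[expr]
        op, left, right = expr
        value = OPS[op](self.eval(left), self.eval(right))
        self.op_count += 1
        if self.use_cse:
            self.cache[expr] = value
        return value


# select a + 1, (a + 1) * (a - 1) from t
a_plus_1 = ("+", "a", 1)
projections = [a_plus_1, ("*", a_plus_1, ("-", "a", 1))]


def run(use_cse):
    ev = Evaluator({"a": 4}, use_cse)
    results = [ev.eval(p) for p in projections]
    return results, ev.op_count


print(run(False))
print(run(True))
```

Both runs return the same projected values; the difference is only in `op_count`, since the second projection reuses the cached `a + 1` when `use_cse` is set.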