[ https://issues.apache.org/jira/browse/DRILL-3912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14947906#comment-14947906 ]
Steven Phillips commented on DRILL-3912:
----------------------------------------

1) I had not enabled CSE in hash join, so it didn't have that problem. Now that I have enabled it in hash join, I am seeing the same SR error.

2) In this case, it looks like the ConstantFilter is causing the '1 + 2' and '1 + 3' parts of the expressions to be resolved first, and then 'a + 1' is no longer common. Duplicate vector reads are removed, though. I think this behavior is probably fine.

3) I am not targeting this for 1.2. Probably for 1.3.

My main motivation here was to solve a problem I was running into in my Union-type work. Function resolution when the input is of Union type involves case statements that check the current type of the input and then execute a branch based on that type. In this case, the condition expression as well as both branches will reference the input. For example, 1 + a would become something like

{code}
case
  when typeOf(a) = int then 1 + cast(a as int)
  when typeOf(a) = varchar then 1 + cast(cast(a as varchar) as int)
end
{code}

So you can see that a single reference to 'a' becomes 3 references. And 'a' might not just be a ValueVectorReadExpression; it could be the output from some other expression tree. And if an input has more than 2 types, or if a function has multiple Union-type inputs, the complexity of the expression increases dramatically, and the amount of generated code gets to be quite large. I needed to find some way to fix this.

> Common subexpression elimination in code generation
> ---------------------------------------------------
>
>                 Key: DRILL-3912
>                 URL: https://issues.apache.org/jira/browse/DRILL-3912
>             Project: Apache Drill
>          Issue Type: Bug
>            Reporter: Steven Phillips
>            Assignee: Jinfeng Ni
>
> Drill currently will evaluate the full expression tree, even if there are
> redundant subtrees. Many of these redundant evaluations can be eliminated by
> reusing the results from previously evaluated expression trees.
> For example,
> {code}
> select a + 1, (a + 1) * (a - 1) from t
> {code}
> Will compute the entire (a + 1) expression twice. With CSE, it will only be
> evaluated once.
> The benefit will be reducing the work done when evaluating expressions, as
> well as reducing the amount of code that is generated, which could also lead
> to better JIT optimization.
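
The reuse described in the issue can be sketched in a few lines of Java. This is a toy illustration and not Drill's code generator: the names `CseSketch`, `Expr`, and `Generator` are hypothetical, subtrees are matched purely by their printed structure (so `a + 1` and `1 + a` are not treated as equal), and every expression is assumed to be deterministic and free of side effects.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy common subexpression elimination; hypothetical, not Drill's implementation. */
public class CseSketch {

    /** Expression node: a leaf (column or literal) when op is null, else a binary operation. */
    static final class Expr {
        final String op;    // null for leaves
        final String leaf;  // column name or literal text for leaves
        final Expr left, right;

        private Expr(String op, String leaf, Expr left, Expr right) {
            this.op = op; this.leaf = leaf; this.left = left; this.right = right;
        }
        static Expr leaf(String text) { return new Expr(null, text, null, null); }
        static Expr bin(String op, Expr l, Expr r) { return new Expr(op, null, l, r); }

        /** Canonical text used as the structural identity of a subtree. */
        String key() {
            return op == null ? leaf : "(" + left.key() + " " + op + " " + right.key() + ")";
        }
    }

    /** Emits one assignment per distinct non-leaf subtree and reuses it afterwards. */
    static final class Generator {
        private final Map<String, String> tempBySubtree = new LinkedHashMap<>();
        final List<String> statements = new ArrayList<>();

        /** Returns the name (temporary, column, or literal) that holds the value of e. */
        String emit(Expr e) {
            if (e.op == null) {
                return e.leaf;
            }
            String key = e.key();
            String existing = tempBySubtree.get(key);
            if (existing != null) {
                return existing; // common subexpression: reuse the earlier result
            }
            String l = emit(e.left);
            String r = emit(e.right);
            String temp = "t" + tempBySubtree.size();
            tempBySubtree.put(key, temp);
            statements.add(temp + " = " + l + " " + e.op + " " + r);
            return temp;
        }
    }

    /** Generates statements for: select a + 1, (a + 1) * (a - 1) from t */
    static List<String> generateExample() {
        Expr a = Expr.leaf("a");
        Expr one = Expr.leaf("1");
        Expr aPlus1 = Expr.bin("+", a, one);
        // The second column builds its own (a + 1) node; it is matched by structure.
        Expr product = Expr.bin("*", Expr.bin("+", a, one), Expr.bin("-", a, one));

        Generator g = new Generator();
        String col0 = g.emit(aPlus1);
        String col1 = g.emit(product);
        g.statements.add("out0 = " + col0);
        g.statements.add("out1 = " + col1);
        return g.statements;
    }

    public static void main(String[] args) {
        generateExample().forEach(System.out::println);
    }
}
```

Running `main` prints five statements, with `a + 1` assigned once to `t0` and then reused by both output columns. The same memoization would apply to the Union-type case expression in the comment, where the subtree producing 'a' is referenced from the condition and from each branch.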