[
https://issues.apache.org/jira/browse/DRILL-5070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Paul Rogers updated DRILL-5070:
-------------------------------
Description:
A handy technique in testing is to compare generated code against a "golden"
copy that defines the expected results. However, at present, Drill generates
code using the method order returned by {{Class.getDeclaredMethods}}, but this
method makes no guarantee about the order of the methods. The order varies from
one run to the next. There is some evidence [this
link|http://stackoverflow.com/questions/28585843/java-reflection-getdeclaredmethods-in-declared-order-strange-behaviour]
that order can vary even within a single run, though a quick test was unable
to reproduce this case.
If method order does indeed vary within a single run, then the order can impact
the Drill code cache since it compares the sources from two different
generation events to detect duplicate code.
This issue appeared when attempting to modify tests to capture generated code
for comparison to future results. Even a simple generated case from
{{ExpressionTest.testBasicExpression()}} that generates {{if(true) then 1 else
0 end}} (all constants) produced methods in different orders on each test run.
The fix is simple, in the {{SignatureHolder}} constructor, sort methods by name
after retrieving them from the class. The sort ensures that method order is
deterministic. Fortunately, the number of methods is small, so the sort step
adds little cost.
was:
The Drill generated code cache compares the sources from two different
generation events to detect duplicate code. Unfortunately, the code generator
emits methods in the order returned by {{Class.getDeclaredMethods}}, but this
method makes no guarantee about the order of the methods.
This issue appeared when attempting to modify tests to capture generated code
for comparison to future results. Even a simple generated case from
{{ExpressionTest.testBasicExpression()}} that generates {{if(true) then 1 else
0 end}} (all constants) produced methods in different orders on each test run.
The fix is simple, in the {{SignatureHolder}} constructor, sort methods by name
after retrieving them from the class. The sort ensures that method order is
deterministic. Fortunately, the number of methods is small, so the sort step
adds little cost.
Without this fix, it is likely that the code cache holds many "copies" of the
same code: equivalent code but with different method orders. After this fix,
the cache should hold only one copy of each bit of equivalent code.
> Code gen: create methods in fixed order to allow test verification
> ------------------------------------------------------------------
>
> Key: DRILL-5070
> URL: https://issues.apache.org/jira/browse/DRILL-5070
> Project: Apache Drill
> Issue Type: Bug
> Affects Versions: 1.8.0
> Reporter: Paul Rogers
> Priority: Minor
>
> A handy technique in testing is to compare generated code against a "golden"
> copy that defines the expected results. However, at present, Drill generates
> code using the method order returned by {{Class.getDeclaredMethods}}, but
> this method makes no guarantee about the order of the methods. The order
> varies from one run to the next. There is some evidence [this
> link|http://stackoverflow.com/questions/28585843/java-reflection-getdeclaredmethods-in-declared-order-strange-behaviour]
> that order can vary even within a single run, though a quick test was unable
> to reproduce this case.
> If method order does indeed vary within a single run, then the order can
> impact the Drill code cache since it compares the sources from two different
> generation events to detect duplicate code.
> This issue appeared when attempting to modify tests to capture generated code
> for comparison to future results. Even a simple generated case from
> {{ExpressionTest.testBasicExpression()}} that generates {{if(true) then 1
> else 0 end}} (all constants) produced methods in different orders on each
> test run.
> The fix is simple, in the {{SignatureHolder}} constructor, sort methods by
> name after retrieving them from the class. The sort ensures that method order
> is deterministic. Fortunately, the number of methods is small, so the sort
> step adds little cost.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)