[ 
https://issues.apache.org/jira/browse/DRILL-5070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15700579#comment-15700579
 ] 

Paul Rogers commented on DRILL-5070:
------------------------------------

Analysis of impact. Method order is unspecified and varies from run to run. If 
a generated class has n methods, then there are n! possible orderings, and the 
cache may hold up to n! different variations when one would do. Reduction in 
cache size for various method counts (assuming each variation ends up being 
cached):

* 2 methods: 1/2
* 3 methods: 1/6
* 4 methods: 1/24

That is, after the fix, a class with 3 methods will store 1/6 as many 
variations as before.
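
The arithmetic above can be checked with a short sketch. The class and method 
names here ({{OrderingCount}}, {{orderings}}) are hypothetical and exist only 
for illustration; they are not part of Drill.

```java
public class OrderingCount {
    // Number of distinct orderings of n methods: n!
    // (hypothetical helper, for illustration only)
    static long orderings(int n) {
        long result = 1;
        for (int i = 2; i <= n; i++) {
            result *= i;
        }
        return result;
    }

    public static void main(String[] args) {
        // Worst case: every ordering of the same class ends up cached.
        for (int n = 2; n <= 4; n++) {
            System.out.println(n + " methods: up to " + orderings(n)
                + " cached variations before the fix, 1 after");
        }
    }
}
```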

Since the cache is cluttered with fewer functionally duplicate classes, more 
room will be available to preserve other classes, potentially further reducing 
the amount of compilation required.

> Code cache compares sources, but method order varies
> ----------------------------------------------------
>
>                 Key: DRILL-5070
>                 URL: https://issues.apache.org/jira/browse/DRILL-5070
>             Project: Apache Drill
>          Issue Type: Bug
>    Affects Versions: 1.8.0
>            Reporter: Paul Rogers
>            Priority: Minor
>
> The Drill generated code cache compares the sources from two different 
> generation events to detect duplicate code. Unfortunately, the code generator 
> emits methods in the order returned by {{Class.getDeclaredMethods}}, but this 
> method makes no guarantee about the order of the methods.
> This issue appeared when attempting to modify tests to capture generated code 
> for comparison to future results. Even a simple generated case from 
> {{ExpressionTest.testBasicExpression()}} that generates {{if(true) then 1 
> else 0 end}} (all constants) produced methods in different orders on each 
> test run.
> The fix is simple: in the {{SignatureHolder}} constructor, sort methods by 
> name after retrieving them from the class. The sort ensures that method order 
> is deterministic. Fortunately, the number of methods is small, so the sort 
> step adds little cost.
> Without this fix, it is likely that the code cache holds many "copies" of the 
> same code: equivalent code but with different method orders. After this fix, 
> the cache should hold only one copy of each bit of equivalent code.
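
A minimal sketch of the idea behind the fix, assuming nothing about the actual 
{{SignatureHolder}} code: fetch the declared methods by reflection, then sort 
them by name so the order is the same on every run. The {{SortedMethods}} 
class and its nested {{Template}} interface are hypothetical stand-ins. The 
secondary sort on the full signature is an added assumption to keep overloaded 
methods in a stable order; the issue itself only mentions sorting by name.

```java
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.Comparator;

public class SortedMethods {
    // Hypothetical stand-in for a code-generation template class.
    interface Template {
        void setup();
        void eval();
        void cleanup();
    }

    // getDeclaredMethods() makes no ordering guarantee, so sort the result.
    // Tie-break on the full signature (an assumption, see the note above)
    // so that overloads with the same name also order deterministically.
    static Method[] declaredMethodsSorted(Class<?> clazz) {
        Method[] methods = clazz.getDeclaredMethods();
        Arrays.sort(methods, Comparator.comparing(Method::getName)
            .thenComparing(Method::toGenericString));
        return methods;
    }

    // Convenience helper returning only the method names, in sorted order.
    static String[] sortedNames(Class<?> clazz) {
        Method[] methods = declaredMethodsSorted(clazz);
        String[] names = new String[methods.length];
        for (int i = 0; i < methods.length; i++) {
            names[i] = methods[i].getName();
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(sortedNames(Template.class)));
    }
}
```

With a stable order, two generation events for the same expression emit 
identical source text, so a source-comparing cache can treat them as one entry.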



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)