Paul Rogers created DRILL-5071:
----------------------------------
Summary: CodeGenerator class unnecessarily keeps two copies of
generated code
Key: DRILL-5071
URL: https://issues.apache.org/jira/browse/DRILL-5071
Project: Apache Drill
Issue Type: Improvement
Affects Versions: 1.8.0
Reporter: Paul Rogers
Priority: Minor
Drill uses a code cache to avoid recompiling the same code multiple times. The
cache is keyed on the generated code itself.
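The generated code contains an ever-increasing name suffix of the form
{{ProjectorGen123}}. To keep that suffix from defeating the cache,
{{CodeGenerator}} uses search and replace to produce a second, "generified"
copy of the code, and that copy serves as the cache key.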
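A minimal sketch of why the suffix matters for a cache keyed on source text, and of what a search-and-replace "generify" step does about it. The class {{GenerifiedKeySketch}}, the placeholder token, and the toy generated source are all hypothetical; Drill's actual {{CodeGenerator}} may do this differently.

```java
public class GenerifiedKeySketch {
    // Hypothetical placeholder token; the token Drill really uses may differ.
    static final String PLACEHOLDER = "GenericGenerated";

    // Produce toy source text the way a generator with a running counter might.
    static String generate(int suffix) {
        String className = "ProjectorGen" + suffix;
        return "public class " + className
            + " { public int eval(int x) { return x + 1; } }";
    }

    // Replace the unique class name with a fixed placeholder so that
    // logically identical code yields identical text.
    static String generify(String code, String className) {
        return code.replace(className, PLACEHOLDER);
    }

    public static void main(String[] args) {
        String first = generate(123);
        String second = generate(124);

        // The raw texts differ only in the suffix, so they make distinct keys.
        System.out.println("raw keys equal: " + first.equals(second));

        // After generification, both map to the same key.
        String key1 = generify(first, "ProjectorGen123");
        String key2 = generify(second, "ProjectorGen124");
        System.out.println("generified keys equal: " + key1.equals(key2));
    }
}
```

The cost of this approach is that each generated class carries two full copies of its source: the named version and the generified version.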
The unique name would be necessary if all generated code shared a single
namespace. But, as currently implemented, each piece of generated code resides
in its own private class loader: the code generated for one operator (say) can
never clash with that for another.
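The isolation argument can be illustrated with plain JDK class loading. The sketch below is not Drill code: {{PrivateLoader}} and {{Generated}} are hypothetical stand-ins, and the class bytes are read from the classpath rather than produced by a compiler. It loads the same class name through two separate loaders and shows that the JVM treats the results as two unrelated classes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

public class LoaderIsolationSketch {

    /** Stand-in for a generated class. Every loaded copy has this same name. */
    public static class Generated { }

    /** Hypothetical loader that defines one class itself and delegates the rest. */
    static class PrivateLoader extends ClassLoader {
        private final String target;
        private final byte[] bytes;

        PrivateLoader(String target, byte[] bytes, ClassLoader parent) {
            super(parent);
            this.target = target;
            this.bytes = bytes;
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve)
                throws ClassNotFoundException {
            if (!name.equals(target)) {
                return super.loadClass(name, resolve);
            }
            synchronized (getClassLoadingLock(name)) {
                Class<?> c = findLoadedClass(name);
                if (c == null) {
                    // Define the class in this loader instead of asking the parent.
                    c = defineClass(name, bytes, 0, bytes.length);
                }
                if (resolve) {
                    resolveClass(c);
                }
                return c;
            }
        }
    }

    /** Read the compiled bytes of a class from the classpath. */
    static byte[] readClassBytes(Class<?> c) throws IOException {
        String resource = c.getName().replace('.', '/') + ".class";
        try (InputStream in = c.getClassLoader().getResourceAsStream(resource)) {
            if (in == null) {
                throw new IOException("class file not found: " + resource);
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        }
    }

    /** Load a fresh copy of Generated in its own private loader. */
    static Class<?> loadPrivateCopy() throws IOException, ClassNotFoundException {
        String name = Generated.class.getName();
        byte[] bytes = readClassBytes(Generated.class);
        ClassLoader parent = LoaderIsolationSketch.class.getClassLoader();
        return new PrivateLoader(name, bytes, parent).loadClass(name);
    }

    public static void main(String[] args) throws Exception {
        Class<?> a = loadPrivateCopy();
        Class<?> b = loadPrivateCopy();
        System.out.println("same name: " + a.getName().equals(b.getName()));
        System.out.println("distinct classes: " + (a != b));
    }
}
```

Because a class's identity in the JVM is the pair of its name and its defining loader, two generated classes with identical names do not collide as long as each lives in its own loader.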
As a result, we can reduce the size and cost of the code cache as follows:
1. Eliminate the numeric suffix on the class name.
2. Eliminate the {{generifiedCode}} member variable in {{CodeGenerator}}.
3. Eliminate the search and replace that produces the "generified" code.
4. Use the actual generated code as the cache key instead of the "generified"
version.
5. Rely on the distinct class loaders to keep generated class names separate.
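The steps above can be sketched as a cache keyed directly on the generated source, with a fixed class name and no second copy. This is an illustration under stated assumptions, not Drill's cache: {{CodeCacheSketch}}, the fixed name {{ProjectorGen}}, the LRU eviction policy, and the fake "compiler" function are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

public class CodeCacheSketch<V> {
    private final Function<String, V> compiler;
    private final Map<String, V> cache;
    private int compileCount = 0;

    public CodeCacheSketch(final int maxEntries, Function<String, V> compiler) {
        this.compiler = compiler;
        // Access-ordered LinkedHashMap gives simple LRU eviction (an assumption;
        // the source only says the cache holds up to 1000 classes).
        this.cache = new LinkedHashMap<String, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, V> eldest) {
                return size() > maxEntries;
            }
        };
    }

    // The generated source itself is the key; no generified copy is kept.
    public synchronized V get(String generatedCode) {
        V compiled = cache.get(generatedCode);
        if (compiled == null) {
            compiled = compiler.apply(generatedCode);
            compileCount++;
            cache.put(generatedCode, compiled);
        }
        return compiled;
    }

    public synchronized int compileCount() { return compileCount; }

    public synchronized int size() { return cache.size(); }

    // Toy generator that always uses the same class name (no numeric suffix).
    static String generate(String expr) {
        return "public class ProjectorGen { public int eval(int x) { return "
            + expr + "; } }";
    }

    public static void main(String[] args) {
        // String::length stands in for a real compile step.
        CodeCacheSketch<Integer> cache = new CodeCacheSketch<>(1000, String::length);
        cache.get(generate("x + 1"));
        cache.get(generate("x + 1")); // identical text, so this is a cache hit
        cache.get(generate("x * 2"));
        System.out.println("compilations: " + cache.compileCount());
    }
}
```

With the suffix gone, two requests for logically identical code produce byte-for-byte identical source, so the source works as the key without any rewriting step.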
The code cache holds up to 1000 classes. Generated source can range from a few
KB to hundreds of KB per class. By eliminating the second code copy, and
assuming an average of about 50 KB per class, we may reduce heap memory
pressure by roughly 50 KB * 1000 = 50 MB.
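The estimate is simple arithmetic, sketched here so the inputs can be varied. The 1000-entry limit and the 50K average come from the description above; the 5K and 300K bounds are assumed stand-ins for "a few K" and "hundreds of K", and {{CacheSavingsEstimate}} is a hypothetical name.

```java
public class CacheSavingsEstimate {
    // Bytes saved if every cached class drops one copy of its source.
    static long savedBytes(long cachedClasses, long avgCopyBytes) {
        return cachedClasses * avgCopyBytes;
    }

    public static void main(String[] args) {
        long entries = 1000;                     // cache capacity from the issue text
        long low = savedBytes(entries, 5_000);   // assumed "few K" case
        long mid = savedBytes(entries, 50_000);  // the 50K average used in the issue
        long high = savedBytes(entries, 300_000); // assumed "hundreds of K" case

        System.out.println("low:  " + low / 1_000_000 + " MB");
        System.out.println("mid:  " + mid / 1_000_000 + " MB");
        System.out.println("high: " + high / 1_000_000 + " MB");
    }
}
```

The saving scales linearly with the average source size, so the real figure depends on how large typical generated classes are for a given workload.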
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)