[
https://issues.apache.org/jira/browse/FLINK-11421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Liya Fan updated FLINK-11421:
-----------------------------
Labels: pull-request-available (was: )
> Providing more compilation options for code-generated operators
> ---------------------------------------------------------------
>
> Key: FLINK-11421
> URL: https://issues.apache.org/jira/browse/FLINK-11421
> Project: Flink
> Issue Type: New Feature
> Components: Table API & SQL
> Reporter: Liya Fan
> Assignee: Liya Fan
> Priority: Major
> Labels: pull-request-available
> Original Estimate: 240h
> Remaining Estimate: 240h
>
> Flink supports some operators (like Calc, Hash Agg, Hash Join, etc.) by code
> generation. That is, Flink generates their source code dynamically, and then
> compile it into Java Byte Code, which is load and executed at runtime.
>
> By default, Flink compiles the generated source code by Janino. This is fast,
> as the compilation often finishes in hundreds of milliseconds. The generated
> Java Byte Code, however, is of poor quality. To illustrate, we use Java
> Compiler API (JCA) to compile the generated code. Experiments on TPC-H (1 TB)
> queries show that the E2E time can be more than 10% shorter, when operators
> are compiled by JCA, despite that it takes more time (a few seconds) to
> compile with JCA.
>
> Therefore, we believe it is beneficial to compile generated code by JCA in
> the following scenarios: 1) For batch jobs, the E2E time is relatively long,
> so it is worth of spending more time compiling and generating high quality
> Java Byte Code. 2) For repeated stream jobs, the generated code will be
> compiled once and run many times. Therefore, it pays to spend more time
> compiling for the first time, and enjoy the high byte code qualities for
> later runs.
>
> According to the above observations, we want to provide a compilation option
> (Janino, JCA, or dynamic) for Flink, so that the user can choose the one
> suitable for their specific scenario and obtain better performance whenever
> possible.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)