That is a good question. Yes, if we want to enable code generation in such
a scenario, it would also need Janino, which increases our footprint by
roughly 0.6MB.

Btw, Janino fits much better into such an in-memory deployment because it
compiles classes in memory without the need to write class files into a
local working directory. The same could be done with
javax.tools.JavaCompiler, but it would require a custom in-memory
JavaFileManager.
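For illustration, here is a minimal sketch of what such a custom in-memory setup with javax.tools.JavaCompiler could look like. All class and method names below are hypothetical, not from SystemML: a ForwardingJavaFileManager redirects class-file output into byte arrays, and a throwaway class loader defines the compiled class directly from those bytes.

```java
import javax.tools.*;
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.net.URI;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: fully in-memory compilation via javax.tools.JavaCompiler.
public class InMemoryCompiler {

    // Holds compiled bytecode in memory instead of writing a .class file.
    static class MemClassFile extends SimpleJavaFileObject {
        final ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        MemClassFile(String name) {
            super(URI.create("mem:///" + name.replace('.', '/') + ".class"), Kind.CLASS);
        }
        @Override public OutputStream openOutputStream() { return bytes; }
    }

    // Source code held as a string rather than as a file on disk.
    static class MemSourceFile extends SimpleJavaFileObject {
        final String code;
        MemSourceFile(String name, String code) {
            super(URI.create("mem:///" + name.replace('.', '/') + ".java"), Kind.SOURCE);
            this.code = code;
        }
        @Override public CharSequence getCharContent(boolean ignoreEncodingErrors) { return code; }
    }

    public static Class<?> compile(String className, String source) throws Exception {
        JavaCompiler javac = ToolProvider.getSystemJavaCompiler(); // still requires a JDK
        Map<String, MemClassFile> output = new HashMap<>();

        // Custom file manager that redirects class-file output into memory.
        JavaFileManager fm = new ForwardingJavaFileManager<StandardJavaFileManager>(
                javac.getStandardFileManager(null, null, null)) {
            @Override public JavaFileObject getJavaFileForOutput(Location location,
                    String name, JavaFileObject.Kind kind, FileObject sibling) {
                MemClassFile f = new MemClassFile(name);
                output.put(name, f);
                return f;
            }
        };

        boolean ok = javac.getTask(null, fm, null, null, null,
                Arrays.asList(new MemSourceFile(className, source))).call();
        if (!ok)
            throw new IllegalStateException("compilation failed");

        // Load the compiled bytes with a throwaway class loader.
        ClassLoader cl = new ClassLoader(InMemoryCompiler.class.getClassLoader()) {
            @Override protected Class<?> findClass(String name) throws ClassNotFoundException {
                MemClassFile f = output.get(name);
                if (f == null) throw new ClassNotFoundException(name);
                byte[] b = f.bytes.toByteArray();
                return defineClass(name, b, 0, b.length);
            }
        };
        return cl.loadClass(className);
    }

    public static void main(String[] args) throws Exception {
        Class<?> c = compile("Hello",
            "public class Hello { public static int add(int a, int b) { return a + b; } }");
        System.out.println(c.getMethod("add", int.class, int.class).invoke(null, 2, 3));
    }
}
```

Note that ToolProvider.getSystemJavaCompiler() returns null on a plain JRE, which is exactly the JDK dependency that Janino's SimpleCompiler avoids.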

Regards,
Matthias

On Fri, Mar 31, 2017 at 9:14 PM, Berthold Reinwald <reinw...@us.ibm.com>
wrote:

> Sounds like a good idea.
>
> Wrt codegen, in a pure Java scoring environment w/o Spark and Hadoop, will
> the dependency on Janino still be there (that question applies to JDK as
> well), and what is the footprint?
>
> Regards,
> Berthold Reinwald
> IBM Almaden Research Center
> office: (408) 927 2208; T/L: 457 2208
> e-mail: reinw...@us.ibm.com
>
>
>
> From:   Matthias Boehm <mboe...@googlemail.com>
> To:     dev@systemml.incubator.apache.org
> Date:   03/31/2017 08:17 PM
> Subject:        Java compiler for code generation
>
>
>
> Hi all,
>
> currently, our new code generator for operator fusion uses the
> programmatic javax.tools.JavaCompiler, which is Java's standard API for
> compilation. Despite a plan cache that mitigates unnecessary compilation
> and recompilation overheads, we still see significant end-to-end overhead,
> especially for small input data.
>
> Moving forward, I'd like to switch to Janino
> (org.codehaus.janino.SimpleCompiler), which is a fast in-memory Java
> compiler with restricted language support. The advantages are
>
> (1) Reduced compilation overhead: On end-to-end scenarios for L2SVM, GLM,
> and MLogreg, Janino reduced total javac compilation time from 2.039 to
> 0.195 (14 operators), from 8.134 to 0.411 (82 operators), and from 4.854
> to 0.283 (46 operators), respectively. At the same time, there was no
> measurable negative impact on runtime efficiency; JIT compilation overhead
> was even slightly reduced.
>
> (2) Removed JDK requirement: Using the standard javax.tools.JavaCompiler
> requires the existence of a JDK, while Janino only requires a JRE, which
> makes it easier to enable code generation by default.
>
> However, I'm raising this here as Janino would add another explicit
> dependency (with BSD license). Fortunately, Spark also uses Janino for
> whole-stage codegen, so we should be able to mark Janino as a provided
> dependency. The only issue is a pure Hadoop environment, where we still
> want to use code generation for CP operations. To simplify the build, I
> could imagine using javax.tools.JavaCompiler for Hadoop execution types,
> but Janino by default.
>
> If you have any concerns, please let me know by Monday; otherwise I'd like
> to push this change into our upcoming 0.14 release.
>
>
> Regards,
> Matthias
>