Re: Experimental code generation

Matthias Boehm Thu, 20 Apr 2017 11:17:59 -0700

That is a good question. Right now we apply codegen after hop rewrites during both initial compilation and dynamic recompilation. There are existing rewrites that similarly do operator fusion, which could be removed, but we'll leave them in the system for now. These existing fused operators are usually limited to 2 or 3 operators to make them generally applicable, whereas codegen can aggressively compile large script-specific operators.
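As a rough illustration of what operator fusion buys in general, here is a minimal Python sketch for an expression like sum(X * Y * Z). It is not SystemML's generated code; the function names and the tiny inputs are hypothetical and only meant to show where the intermediates disappear.

```python
def unfused_sum_product(x, y, z):
    # Each operator runs on its own and materializes a full intermediate.
    xy = [a * b for a, b in zip(x, y)]      # intermediate 1: X * Y
    xyz = [p * c for p, c in zip(xy, z)]    # intermediate 2: (X * Y) * Z
    return sum(xyz)                         # final aggregation


def fused_sum_product(x, y, z):
    # One pass over the inputs; no intermediate vectors are allocated.
    total = 0.0
    for a, b, c in zip(x, y, z):
        total += a * b * c
    return total


# Hypothetical toy inputs; both variants compute the same value.
x = [1.0, 2.0, 3.0]
y = [4.0, 5.0, 6.0]
z = [0.5, 0.0, 2.0]
assert unfused_sum_product(x, y, z) == fused_sum_product(x, y, z)
```

The unfused variant writes and re-reads two temporary vectors, while the fused variant scans each input exactly once. A hand-coded fused operator covers a fixed small pattern like this one, whereas codegen would produce such a loop for whatever chain a given script happens to contain.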

Down the road, we'll aim to extend this framework to handle both automatic rewrites and operator fusion (aka codegen) in a holistic manner. Such a holistic approach would allow reasoning about side effects, where rewrites influence fusion potential and vice versa. However, for now, I'd like to get codegen into a production-ready state before taking the next step in this direction.



Regards,
Matthias

On 4/20/2017 10:16 AM, dusenberr...@gmail.com wrote:

Excellent, I'll start experimenting with this in our deep learning work.

Question: what is the relationship between codegen and our rewrite rules?

--

Mike Dusenberry
GitHub: github.com/dusenberrymw
LinkedIn: linkedin.com/in/mikedusenberry

Sent from my iPhone.

On Apr 20, 2017, at 8:32 AM, Berthold Reinwald <reinw...@us.ibm.com> wrote:

This is awesome!

Regards,
Berthold Reinwald
IBM Almaden Research Center
office: (408) 927 2208; T/L: 457 2208
e-mail: reinw...@us.ibm.com



From:   Matthias Boehm <mboe...@googlemail.com>
To:     dev@systemml.incubator.apache.org
Date:   04/20/2017 02:41 AM
Subject:        Experimental code generation



Hi all,

Our new code generation feature is now sufficiently stable to enter
broader testing, with the goal of further improving its capabilities. If
you're interested, you can enable this feature via

<codegen.enabled>true</codegen.enabled>

in your SystemML-config.xml file. The major advantages are fewer
intermediates (read and write, incl. potentially fewer evictions), fewer
scans of inputs and intermediates, and better sparsity exploitation across
chains of operations.

On our mainstream algorithms, we see significant improvements compared to
existing fused operators for scenarios with few features, i.e., when the
vector and matrix intermediates become the bottleneck, or scripts with
missing sparsity-exploiting operations. For example, on a 100M x 10 (8GB)
scenario of L2SVM w/ 20 outer iterations, codegen improves performance
from 219s (496s without hand-coded fused operators) to 32s.
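
To put those numbers in perspective, the following Python sketch works through the arithmetic. It assumes a dense matrix of 8-byte doubles and 1 GB = 10^9 bytes, neither of which is stated above; the variable names are illustrative only.

```python
# Assumption: dense double-precision storage, 8 bytes per cell.
rows, cols, bytes_per_cell = 100_000_000, 10, 8
dense_size_gb = rows * cols * bytes_per_cell / 1e9

# Runtimes in seconds, as reported above for L2SVM with 20 outer iterations.
t_codegen = 32.0
t_hand_fused = 219.0   # existing hand-coded fused operators
t_no_fused = 496.0     # without hand-coded fused operators

speedup_vs_hand_fused = t_hand_fused / t_codegen
speedup_vs_no_fused = t_no_fused / t_codegen

print(f"input size: {dense_size_gb:.1f} GB")
print(f"speedup vs hand-coded fused operators: {speedup_vs_hand_fused:.1f}x")
print(f"speedup vs no fused operators: {speedup_vs_no_fused:.1f}x")
```

Under these assumptions the input comes out at 8 GB, and codegen is roughly 6.8x faster than the hand-coded fused operators and 15.5x faster than running without them.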

So please bring your favorite expressions. If you have interesting
scripts, please give it a try and share any issues or patterns that we're
currently not handling very well. Thanks.


Regards,
Matthias
