Hi all,

our new code generation feature is now sufficiently stable for broader
testing, with the goal of further improving its capabilities. If you're
interested, you can enable this feature via

<codegen.enabled>true</codegen.enabled>

in your SystemML-config.xml file. The major advantages are fewer
intermediates (read and write, incl. potentially fewer evictions), fewer
scans of inputs and intermediates, and better sparsity exploitation across
chains of operations.
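
To make these advantages concrete, here is a minimal sketch in plain Python
of what fusing a chain of cell-wise operations buys. It is only an
illustration of the idea, not SystemML code or its generated operators; the
expression sum(x * y * z) and both function names are hypothetical.

```python
# Hypothetical illustration only; not SystemML code or its generated operators.

def unfused_sum_of_products(x, y, z):
    """Evaluate sum(x * y * z) one operator at a time."""
    tmp1 = [a * b for a, b in zip(x, y)]     # intermediate 1: written, then read again
    tmp2 = [t * c for t, c in zip(tmp1, z)]  # intermediate 2: written, then read again
    return sum(tmp2)                         # one more scan to aggregate


def fused_sum_of_products(x, y, z):
    """Evaluate the same expression in a single pass without intermediates."""
    total = 0.0
    for a, b, c in zip(x, y, z):
        if a == 0.0:
            # Sparsity exploitation: a zero in x zeroes the whole product
            # (assuming finite values in y and z), so the cell is skipped.
            continue
        total += a * b * c
    return total


x = [0.0, 2.0, 0.0, 4.0]
y = [1.0, 3.0, 5.0, 7.0]
z = [2.0, 2.0, 2.0, 2.0]

# Both variants compute the same result; only the amount of work differs.
assert unfused_sum_of_products(x, y, z) == fused_sum_of_products(x, y, z)
```

The unfused variant materializes two full-size intermediates and scans the
data three times, whereas the fused variant makes one pass and skips cells
where x is zero.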

On our mainstream algorithms, we see significant improvements compared to
the existing fused operators in scenarios with few features, i.e., when the
vector and matrix intermediates become the bottleneck, and for scripts that
lack sparsity-exploiting operations. For example, on a 100M x 10 (8GB)
scenario of L2SVM with 20 outer iterations, codegen improves performance
from 219s (496s without hand-coded fused operators) to 32s, a speedup of
roughly 6.8x (15.5x).

So please bring your favorite expressions. If you have interesting scripts,
please give codegen a try and share any issues or patterns that we're
currently not handling very well. Thanks.


Regards,
Matthias
