Hi all,

with our SystemML 1.0 release around the corner, I think we should start the discussion on the roadmap for SystemML 1.1 and beyond. Below is an initial list as a starting point, but please help to add relevant items, especially for algorithms and APIs, which are barely covered so far.

1) Deep Learning
 * Full compiler integration GPU backend
 * Extended sparse operations on CPU/GPU
 * Extended single-precision support CPU
 * Distributed DL operations?

2) GPU Backend
 * Full support for sparse operations
 * Automatic decisions on CPU vs GPU operations
 * Graduate GPU backends (enable by default)

3) Code generation
 * Graduate code generation (enable by default)
 * Support for deep learning operations
 * Code generation for the heterogeneous HW, incl GPUs

4) Compressed Linear Algebra
 * Support for matrix-matrix multiplications
 * Support for deep learning operations
 * Improvements for ultra-sparse datasets

5) Misc Runtime
 * Large dense matrix blocks > 16GB
 * NUMA-awareness (thread pools, matrix partitioning)
 * Unified memory management (ops, bufferpool, RDDs/broadcasts)
 * Support feather format for matrices and frames
 * Parfor support for broadcasts
 * Extended support for multi-threaded operations
 * Boolean matrices

6) Misc Compiler
 * Support single-output UDFs in expressions
 * Consolidate replicated compilation chain (e.g., diff APIs)
 * Holistic sum-product optimization and operator fusion
 * Extended sparsity estimators
 * Rewrites and compiler improvements for mini-batching
 * Parfor optimizer support for shared reads

7) APIs
 * Python Binding for JMLC API
 * Consistency Python/Java APIs

Regards,
Matthias
