Hi All,

Thanks for fixing the bugs. A few tasks/bugs (see below) are still being worked on and will hopefully be closed in a couple of days. We should then be ready to distribute the release candidates by the end of this week.
- python tutorials,
- failing gpu tests,
- error in loading native BLAS.

Regards,
Arnab.

On Thu, Sep 10, 2020 at 11:24 AM arnab phani <phaniar...@gmail.com> wrote:

> Thank you all for the notes.
> Please find the consolidated release notes below, and please let me know
> if anything major is missing.
>
> *Release notes for SystemDS 2.0.*
>
> SystemDS 2.0 is the first major release under the new name. This release
> contains a major refactoring, a few major features, a large number of
> improvements and fixes, and some experimental features to better support
> the end-to-end data science lifecycle. In addition to that, this release
> also removes several features that are not up to the mark and outdated.
>
> The major changes (compared to SystemML 1.2) include
>
>
> - New mechanism for DML-bodied (script-level) builtin functions, and a
>   wealth of new built-in functions for data preprocessing including data
>   cleaning, augmentation and feature engineering techniques, new ML
>   algorithms, and model debugging.
> - Several methods for data cleaning have been implemented including
>   multiple imputations with multivariate imputation by chained equations
>   (MICE) and other techniques, SMOTE, an oversampling technique for class
>   imbalance, forward and backward NA filling, cleaning using schema and
>   length information, support for outlier detection using standard deviation
>   and inter-quartile range, and functional dependency discovery.
> - A complete framework for lineage tracing and reuse including support
>   for loop deduplication, full and partial reuse, compiler assisted reuse,
>   several new rewrites to facilitate reuse.
> - New federated runtime backend including support for federated
>   matrices and frames, federated builtins (transform-encode, decode etc.).
> - Refactor compression package and add functionalities including
>   quantization for lossy compression, binary cell operations, left matrix
>   multiplication.
> - New Python bindings with support for several builtins, matrix
>   operations, federated tensors, and lineage traces.
> - CUDA implementation of cumulative aggregate operators (cumsum,
>   cumprod etc.)
> - New model debugging technique with slice finder.
> - New tensor data model (basic tensors of different value types, data
>   tensors with schema) [experimental]
> - Cloud deployment scripts for AWS and scripts to set up and start
>   federated operations.
> - Performance improvements with parallel sort, GPU cumulative
>   aggregates, append cbind etc.
> - Various compiler and runtime improvements including new and
>   improved rewrites, reduced Spark context creation, new eval framework, list
>   operations, updated native kernel libraries to name a few.
> - New data reader/writer for JSON frames and support for SQL as a data
>   source.
> - Miscellaneous improvements: improved documentation, better testing,
>   run/release scripts, improved packaging, Docker container for SystemDS, bug
>   fixes.
> - Removed MapReduce compiler and runtime backend, pydml parser,
>   Java-UDF framework, script-level debugger.
>
>
> Regards,
> Arnab.
>
>
> On Tue, Sep 8, 2020 at 4:10 AM Mark Dokter <mdok...@know-center.at> wrote:
>
>> On 01.09.20 11:36, arnab phani wrote:
>> > While I will aggregate the notes from two SystemDS releases, it will be
>> > great if you can update me with a few lines summarizing the additions to
>> > your features (including the external contributions), especially after
>> > March 24, 2020 (last SystemDS release).
>>
>> Hi Arnab!
>>
>> My contributions:
>>
>> - new run script
>> - improve/simplify release scripts
>> - various release related things (improve documentation, fix license
>>   headers, clean up pom.xml, etc)
>> - cuda implementation of cumulative aggregate operators (cumsum,
>>   cumprod, etc)
>> - bug fixes here and there
>> - maintain native blas support in a working state (now also supporting
>>   windows)
>> - kmeans builtin dml function
>> - builtins for image augmentation
>>
>> Best,
>> Mark
>>
>