Hi All,

Thanks for fixing the remaining issues. I will cut the first release candidates later this afternoon (CET).

Regards,
Arnab..

On Tue, Sep 22, 2020 at 2:53 PM arnab phani <phaniar...@gmail.com> wrote:

> Hi All,
>
> Thanks for fixing the bugs.
> A few tasks/bugs (see below) are still being worked on and hopefully will
> be closed in a couple of days.
> And we should be all ready to distribute the release candidates by the end
> of this week.
>
> - python tutorials,
> - failing gpu tests,
> - error in loading native BLAS.
>
> Regards,
> Arnab..
>
> On Thu, Sep 10, 2020 at 11:24 AM arnab phani <phaniar...@gmail.com> wrote:
>
>> Thank you all for the notes.
>> Please find the consolidated release notes below, and please let me know
>> if anything major is missing.
>>
>> *Release notes for SystemDS 2.0.*
>>
>> SystemDS 2.0 is the first major release under the new name. This release
>> contains a major refactoring, a few major features, a large number of
>> improvements and fixes, and some experimental features to better support
>> the end-to-end data science lifecycle. In addition to that, this release
>> also removes several features that are not up to the mark and outdated.
>>
>> The major changes (compared to SystemML 1.2) include
>>
>>
>> - New mechanism for DML-bodied (script-level) builtin functions, and
>>   a wealth of new built-in functions for data preprocessing including data
>>   cleaning, augmentation and feature engineering techniques, new ML
>>   algorithms, and model debugging.
>> - Several methods for data cleaning have been implemented including
>>   multiple imputations with multivariate imputation by chained equations
>>   (MICE) and other techniques, SMOTE, an oversampling technique for class
>>   imbalance, forward and backward NA filling, cleaning using schema and
>>   length information, support for outlier detection using standard deviation
>>   and inter-quartile range, and functional dependency discovery.
>> - A complete framework for lineage tracing and reuse including
>>   support for loop deduplication, full and partial reuse, compiler assisted
>>   reuse, several new rewrites to facilitate reuse.
>> - New federated runtime backend including support for federated
>>   matrices and frames, federated builtins (transform-encode, decode etc.).
>> - Refactor compression package and add functionalities including
>>   quantization for lossy compression, binary cell operations, left matrix
>>   multiplication.
>> - New python bindings with support for several builtins, matrix
>>   operations, federated tensors, and lineage traces.
>> - CUDA implementation of cumulative aggregate operators (cumsum,
>>   cumprod etc.)
>> - New model debugging technique with slice finder.
>> - New tensor data model (basic tensors of different value types, data
>>   tensors with schema) [experimental]
>> - Cloud deployment scripts for AWS and scripts to set up and start
>>   federated operations.
>> - Performance improvements with parallel sort, gpu cum agg, append
>>   cbind etc.
>> - Various compiler and runtime improvements including new and
>>   improved rewrites, reduced Spark context creation, new eval framework,
>>   list operations, updated native kernel libraries to name a few.
>> - New data reader/writer for json frames and support for sql as a
>>   data source.
>> - Miscellaneous improvements: improved documentation, better
>>   testing, run/release scripts, improved packaging, Docker container for
>>   systemds, bug fixes.
>> - Removed MapReduce compiler and runtime backend, pydml parser,
>>   Java-UDF framework, script-level debugger.
>>
>>
>> Regards,
>> Arnab.
>>
>>
>> On Tue, Sep 8, 2020 at 4:10 AM Mark Dokter <mdok...@know-center.at>
>> wrote:
>>
>>> On 01.09.20 11:36, arnab phani wrote:
>>> > While I will aggregate the notes from two SystemDS releases, it will be
>>> > great if you can update me with a few lines summarizing the additions to
>>> > your features (including the external contributions), especially after
>>> > March 24, 2020 (last SystemDS release).
>>>
>>> Hi Arnab!
>>>
>>> My contributions:
>>>
>>> - new run script
>>> - improve/simplify release scripts
>>> - various release related things (improve documentation, fix license
>>>   headers, clean up pom.xml, etc)
>>> - cuda implementation of cumulative aggregate operators (cumsum,
>>>   cumprod, etc)
>>> - bug fixes here and there
>>> - maintain native blas support in a working state (now also supporting
>>>   windows)
>>> - kmeans builtin dml function
>>> - builtins for image augmentation
>>>
>>> Best,
>>> Mark
>>>