Hi Arnab,

Here is my list, feel free to remove elements 😊

Major:

- Refactor Compression package and add functions
  - Add quantization for lossy compression
  - Generalize column groups to use same base dictionary
  - Binary cell operations
  - Left Matrix Multiplication
- GitHub actions for automated testing
- Improved compile times and packaging
- Docker containers for systemds, pythonsystemds and testingsystemds

Minor:

- Python PCA and MultiLogReg algorithms
- parallel sort
- parallel detect schema
- URL handler for federated
- Distinct values count / estimation function
- Simplified Log4J setup, from Hadoop-based to our own
- Handle NaStrings when reading frames and matrices from CSV
- Re-enable code coverage tools

Removed:

- GitHub Pages for documentation (moved to master)
- Travis testing


Best regards

Sebastian

________________________________
From: arnab phani <phaniar...@gmail.com>
Sent: Monday, September 7, 2020 9:26:12 AM
To: dev@systemds.apache.org
Subject: Re: [DISCUSS] Apache SystemDS 2.0 Release

Thanks Kevin.

Other committers: once you get a chance, please send me your contributions
too.

Regards,
Arnab..

On Wed, Sep 2, 2020 at 10:04 PM Kevin Innerebner <
innereb...@student.tugraz.at> wrote:

> Hi,
>
> here are the changes I contributed after March 24:
>
> - Added SystemDSContext to python api (now necessary for operations)
>
> - Added federated frames
>
> - Federated transform-encode, -decode and -apply (missing value
> imputation is still an ongoing PR, I think it will be merged in before
> release)
>
> - New builtin `colnames()` to get the column names of a frame
>
> That should be everything from my side.
>
> Regards,
> Kevin
>
> On 9/1/20 11:36 AM, arnab phani wrote:
> > Hi All,
> >
> > As we are nearing the release, I am starting to focus on the release
> > notes.
> > Notes for SystemDS 2.0 release should consolidate all the things that
> > happened since Aug 2018 (last SystemML release).
> > While I will aggregate the notes from two SystemDS releases, it will be
> > great if you can update me with a few lines summarizing the additions to
> > your features (including the external contributions), especially after
> > March 24, 2020 (last SystemDS release).
> >
> > Once ready, I will share for everyone to have a look.
> >
> > Regards,
> > Arnab..
> >
> > On Mon, Aug 31, 2020 at 8:34 PM Matthias Boehm <mboe...@gmail.com> wrote:
> >
> >> thanks Arnab for looking over the remaining open issues. Together with
> >> Shafaq, we just came across two additional bugs related to eval function
> >> calls. These fixes should go into the RC and I intend to fix them as
> >> soon as possible.
> >>
> >> Regards,
> >> Matthias
> >>
> >> On 8/27/2020 8:41 PM, arnab phani wrote:
> >>> Hi All,
> >>>
> >>> Currently, I see only a few issues are flagged for 2.0 release. Can you
> >>> please go through your open issues and check if the Fix-Version is set?
> >>> Also, if a JIRA task doesn't exist for something you are working on or
> >>> want to have in the coming release, please open a task and flag it for
> >>> 2.0.
> >>>
> >>> Regards,
> >>> Arnab..
> >>>
> >>> On Thu, Aug 20, 2020 at 8:18 PM Matthias Boehm <mboe...@gmail.com> wrote:
> >>>> as the target release date end of August comes closer, I'd like to share
> >>>> that Arnab Phani kindly volunteered in an offline discussion to act as
> >>>> the release manager for our 2.0 release.
> >>>>
> >>>> Please, flag issues and features you think are important for the 2.0
> >>>> release as such in JIRA so we can monitor them, discuss them on a case
> >>>> by case basis, and push the release date if necessary. Thanks.
> >>>>
> >>>> Regards,
> >>>> Matthias
> >>>>
> >>>> On 8/17/2020 2:51 PM, Janardhan wrote:
> >>>>> Hi,
> >>>>>
> >>>>> The following is the status of the MLContext test for algorithms.
> >>>>>
> >>>>> 1. l2svm, msvm, PCA - scripts are running + results are not equal to R
> >>>>> 2. Autoencoder, StepwiseReg - Scripts are not running
> >>>>> 3. KMeans, GLM (need to fix R) - No R script
> >>>>>
> >>>>> Thank you,
> >>>>> Janardhan
> >>>>>
> >>>>> On Fri, Jul 10, 2020 at 2:29 AM Matthias Boehm <mboe...@gmail.com> wrote:
> >>>>>> thanks for the perspective, I think we should be very pragmatic
> >>>>>> regarding languages. Let's stick to DML as our domain-specific language
> >>>>>> with R-like syntax, but add language bindings such as the Python API
> >>>>>> (and others) to seamlessly plug into common data science workflows. A
> >>>>>> similar mindset worked very well in the internals too: Java for nicely
> >>>>>> integrating with Hadoop/Spark and simplicity, but with C++ and CUDA
> >>>>>> kernels and native libraries where necessary.
> >>>>>>
> >>>>>> Regards,
> >>>>>> Matthias
> >>>>>>
> >>>>>> On 7/9/2020 3:54 PM, Janardhan wrote:
> >>>>>>> DML - %*% seems more intuitive compared to @. Let us not change the
> >>>>>>> syntax (our selling point: easy porting to R!)
> >>>>>>> Python - no solid opinion
> >>>>>>>
> >>>>>>> - Janardhan
> >>>>>>>
> >>>>>>> On Thu, 9 Jul, 2020, 19:06 Matthias Boehm, <mboe...@gmail.com> wrote:
> >>>>>>>> for the Python API this is fine, for DML not as we should stick as close
> >>>>>>>> as possible to R syntax. Once we had a pydml syntax too, but this
> >>>>>>>> created lots of inconsistencies and could not use Python as a host
> >>>>>>>> language. So, I think restricting such changes to the Python API is a
> >>>>>>>> good path forward. Other opinions?
> >>>>>>>>
> >>>>>>>> Regards,
> >>>>>>>> Matthias
> >>>>>>>>
> >>>>>>>> On 7/9/2020 3:31 PM, Baunsgaard, Sebastian wrote:
> >>>>>>>>> Hi all
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Can I suggest a radical change to matrix multiply:
> >>>>>>>>> changing the operator from %*% to @?
> >>>>>>>>>
> >>>>>>>>> Python has made this commitment!
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> https://www.python.org/dev/peps/pep-0465/
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Or at least change this in the Python API?
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Best regards
> >>>>>>>>>
> >>>>>>>>> Sebastian
> >>>>>>>>>
> >>>>>>>>> ________________________________
> >>>>>>>>> From: Matthias Boehm <mboe...@gmail.com>
> >>>>>>>>> Sent: Wednesday, July 8, 2020 11:04:12 PM
> >>>>>>>>> To: dev@systemds.apache.org
> >>>>>>>>> Subject: [DISCUSS] Apache SystemDS 2.0 Release
> >>>>>>>>>
> >>>>>>>>> Hi all,
> >>>>>>>>>
> >>>>>>>>> I'd like to propose Aug 31 as a target date for the SystemDS 2.0
> >>>>>>>>> release (feature freeze August 21). This should give us enough time
> >>>>>>>>> to figure out the list of things that still should go into this
> >>>>>>>>> release, as it's a major release and thus an opportunity for changes
> >>>>>>>>> of external behavior. However, as it's the first SystemDS Apache
> >>>>>>>>> release, I think we should still stick to Spark 2.x and Java 8 and
> >>>>>>>>> consider upgrades of Spark and the JDK for subsequent releases. So,
> >>>>>>>>> what do you think, and are there any major features you'd like to
> >>>>>>>>> see complete for 2.0?
> >>>>>>>>>
> >>>>>>>>> Regards,
> >>>>>>>>> Matthias
> >>>>>>>>>
>
