Hi, here are the changes I contributed after March 24:
- Added SystemDSContext to the Python API (now necessary for operations)
- Added federated frames
- Federated transform-encode, -decode and -apply (missing value imputation is
  still an ongoing PR, I think it will be merged in before release)
- New builtin `colnames()` to get the column names of a frame

That should be everything from my side.

Regards,
Kevin

On 9/1/20 11:36 AM, arnab phani wrote:
> Hi All,
>
> As we are nearing the release, I am starting to focus on the release notes.
> Notes for SystemDS 2.0 release should consolidate all the things that
> happened since Aug 2018 (last SystemML release).
> While I will aggregate the notes from two SystemDS releases, it will be
> great if you can update me with a few lines summarizing the additions to
> your features (including the external contributions), especially after
> March 24, 2020 (last SystemDS release).
>
> Once ready, I will share for everyone to have a look.
>
> Regards,
> Arnab..
>
> On Mon, Aug 31, 2020 at 8:34 PM Matthias Boehm <[email protected]> wrote:
>
>> thanks Arnab for looking over the remaining open issues. Together with
>> Shafaq, we just came across two additional bugs related to eval function
>> calls. These fixes should go into the RC and I intend to fix them as
>> soon as possible.
>>
>> Regards,
>> Matthias
>>
>> On 8/27/2020 8:41 PM, arnab phani wrote:
>>> Hi All,
>>>
>>> Currently, I see only a few issues are flagged for 2.0 release. Can you
>>> please go through your open issues and check if the Fix-Version is set?
>>> Also, if a JIRA task doesn't exist for something you are working on or want
>>> to have in the coming release, please open a task and flag it for 2.0.
>>>
>>> Regards,
>>> Arnab..
>>>
>>> On Thu, Aug 20, 2020 at 8:18 PM Matthias Boehm <[email protected]> wrote:
>>>> as the target release date end of August comes closer, I'd like to share
>>>> that Arnab Phani kindly volunteered in an offline discussion to act as
>>>> the release manager for our 2.0 release.
>>>>
>>>> Please, flag issues and features you think are important for the 2.0
>>>> release as such in JIRA so we can monitor them, discuss them on a case
>>>> by case basis, and push the release date if necessary. Thanks.
>>>>
>>>> Regards,
>>>> Matthias
>>>>
>>>> On 8/17/2020 2:51 PM, Janardhan wrote:
>>>>> Hi,
>>>>>
>>>>> The following is the status of the MLContext test for algorithms.
>>>>>
>>>>> 1. l2svm, msvm, PCA - scripts are running + results are not equal to R
>>>>> 2. Autoencoder, StepwiseReg - Scripts are not running
>>>>> 3. KMeans, GLM (need to fix R) - No R script
>>>>>
>>>>> Thank you,
>>>>> Janardhan
>>>>>
>>>>> On Fri, Jul 10, 2020 at 2:29 AM Matthias Boehm <[email protected]> wrote:
>>>>>> thanks for the perspective, I think we should be very pragmatic
>>>>>> regarding languages. Let's stick to DML as our domain-specific language
>>>>>> with R-like syntax, but add language bindings such as the Python API
>>>>>> (and others) to seamlessly plug into common data science workflows. A
>>>>>> similar mind set worked very well in the internals too: Java for nicely
>>>>>> integrating with Hadoop/Spark and simplicity, but with C++ and CUDA
>>>>>> kernels and native libraries where necessary.
>>>>>>
>>>>>> Regards,
>>>>>> Matthias
>>>>>>
>>>>>> On 7/9/2020 3:54 PM, Janardhan wrote:
>>>>>>> DML - %*% seems more Intuitive compared to @. Let us not change the syntax
>>>>>>> ( our selling point easy porting to R! )
>>>>>>> Python - no solid opinion
>>>>>>>
>>>>>>> - Janardhan
>>>>>>>
>>>>>>> On Thu, 9 Jul, 2020, 19:06 Matthias Boehm, <[email protected]> wrote:
>>>>>>>> for the Python API this is fine, for DML not as we should stick as close
>>>>>>>> as possible to R syntax. Once we had a pydml syntax too, but this
>>>>>>>> created lots of inconsistencies and could not use Python as a host
>>>>>>>> language. So, I think restricting such changes to the Python API is a
>>>>>>>> good path forward. Other opinions?
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Matthias
>>>>>>>>
>>>>>>>> On 7/9/2020 3:31 PM, Baunsgaard, Sebastian wrote:
>>>>>>>>> Hi all
>>>>>>>>>
>>>>>>>>> Can I suggest a radical change to matrix multiply:
>>>>>>>>> to change the command from %*% to @.
>>>>>>>>>
>>>>>>>>> Python has made this commitment!
>>>>>>>>>
>>>>>>>>> https://www.python.org/dev/peps/pep-0465/
>>>>>>>>>
>>>>>>>>> or at least change this in the Python API?
>>>>>>>>>
>>>>>>>>> Best regards
>>>>>>>>>
>>>>>>>>> Sebastian
>>>>>>>>>
>>>>>>>>> ________________________________
>>>>>>>>> From: Matthias Boehm <[email protected]>
>>>>>>>>> Sent: Wednesday, July 8, 2020 11:04:12 PM
>>>>>>>>> To: [email protected]
>>>>>>>>> Subject: [DISCUSS] Apache SystemDS 2.0 Release
>>>>>>>>>
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> I'd like to propose Aug 31 as a target date for the SystemDS 2.0 release
>>>>>>>>> (feature freeze August 21). This should give us enough time to figure
>>>>>>>>> out the list of things that still should go into this release, as a
>>>>>>>>> major release is an opportunity for changes of external behavior. However, as
>>>>>>>>> it's the first SystemDS Apache release, I think we should still stick to
>>>>>>>>> Spark 2.x and Java 8 and consider upgrades of Spark and the JDK for
>>>>>>>>> subsequent releases. So, what do you think and any major features you'd
>>>>>>>>> like to see complete for 2.0?
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Matthias
>>>>>>>>>
