Re: [DISCUSS] Apache SystemDS 2.0 Release

2020-08-31 Thread Matthias Boehm
Thanks, Arnab, for looking over the remaining open issues. Together with
Shafaq, we just came across two additional bugs related to eval function
calls. These fixes should go into the RC, and I intend to make them as
soon as possible.


Regards,
Matthias

On 8/27/2020 8:41 PM, arnab phani wrote:

Hi All,

Currently, I see that only a few issues are flagged for the 2.0 release. Can
you please go through your open issues and check that the Fix-Version is set?
Also, if a JIRA task doesn't exist for something you are working on or want
to have in the coming release, please open a task and flag it for 2.0.

Regards,
Arnab..

On Thu, Aug 20, 2020 at 8:18 PM Matthias Boehm  wrote:


As the target release date at the end of August approaches, I'd like to
share that Arnab Phani kindly volunteered in an offline discussion to act as
the release manager for our 2.0 release.

Please flag issues and features you think are important for the 2.0
release as such in JIRA so we can monitor them, discuss them on a case
by case basis, and push the release date if necessary. Thanks.

Regards,
Matthias

On 8/17/2020 2:51 PM, Janardhan wrote:

Hi,

The following is the status of the MLContext test for algorithms.

1. l2svm, msvm, PCA - scripts run, but results do not match R
2. Autoencoder, StepwiseReg - scripts do not run
3. KMeans, GLM (need to fix R) - No R script

Thank you,
Janardhan

On Fri, Jul 10, 2020 at 2:29 AM Matthias Boehm wrote:



Thanks for the perspective. I think we should be very pragmatic
regarding languages. Let's stick to DML as our domain-specific language
with R-like syntax, but add language bindings such as the Python API
(and others) to seamlessly plug into common data science workflows. A
similar mindset worked very well in the internals too: Java for nicely
integrating with Hadoop/Spark and for simplicity, but with C++ and CUDA
kernels and native libraries where necessary.

Regards,
Matthias

On 7/9/2020 3:54 PM, Janardhan wrote:

DML - %*% seems more intuitive compared to @. Let us not change the syntax
(our selling point: easy porting to R!)
Python - no solid opinion

- Janardhan

On Thu, 9 Jul, 2020, 19:06 Matthias Boehm,  wrote:


For the Python API this is fine, but not for DML, as we should stick as
close as possible to R syntax. We once had a pydml syntax too, but it
created lots of inconsistencies and could not use Python as a host
language. So, I think restricting such changes to the Python API is a
good path forward. Other opinions?

Regards,
Matthias
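
To illustrate the split suggested above (@ in the Python API, %*% kept in DML), here is a minimal sketch of how a Python binding could map the PEP 465 operator onto the DML operator. The class `MatrixNode` and its `dml_expr` attribute are hypothetical names made up for this example; this is not the actual SystemDS Python API.

```python
class MatrixNode:
    """Hypothetical lazy matrix handle that records a DML expression string."""

    def __init__(self, dml_expr: str):
        self.dml_expr = dml_expr

    def __matmul__(self, other: "MatrixNode") -> "MatrixNode":
        # Python's @ (PEP 465) is translated into DML's R-style %*% operator,
        # so the DML syntax itself stays unchanged.
        return MatrixNode(f"({self.dml_expr} %*% {other.dml_expr})")


X = MatrixNode("X")
W = MatrixNode("W")
b = MatrixNode("b")

expr = (X @ W) @ b
print(expr.dml_expr)  # ((X %*% W) %*% b)
```

With this kind of mapping, Python users write `X @ W` while the generated script still uses `%*%`, which keeps DML close to R.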

On 7/9/2020 3:31 PM, Baunsgaard, Sebastian wrote:

Hi all


Can I suggest a radical change to matrix multiplication:
changing the operator from %*% to @?

Python has made this commitment!

https://www.python.org/dev/peps/pep-0465/

Or could we at least change this in the Python API?


Best regards

Sebastian


From: Matthias Boehm 
Sent: Wednesday, July 8, 2020 11:04:12 PM
To: dev@systemds.apache.org
Subject: [DISCUSS] Apache SystemDS 2.0 Release

Hi all,

I'd like to propose Aug 31 as a target date for the SystemDS 2.0 release
(feature freeze August 21). This should give us enough time to figure
out the list of things that should still go into this release, as a major
release is an opportunity for changes of external behavior. However, as
it's the first SystemDS Apache release, I think we should still stick to
Spark 2.x and Java 8 and consider upgrades of Spark and the JDK for
subsequent releases. So, what do you think, and are there any major features
you'd like to see completed for 2.0?

Regards,
Matthias


[GitHub] [systemds] kev-inn opened a new pull request #1046: [SYSTEMDS-2556,2560] Add federated Encoder impute support and improve Omit

2020-08-31 Thread GitBox


kev-inn opened a new pull request #1046:
URL: https://github.com/apache/systemds/pull/1046


   Adds support for federated execution for the final encoder,
   `EncoderMVImpute`. This should complete support for federated transform
   operations (performance work and other improvements remain TODO).

   ## `EncoderMVImpute`

   Note that I removed quite a bit of code from `EncoderMVImpute` as it seems
   not to be in use at all; please confirm that this is fine.

   ## `EncoderOmit`

   I added the perf improvement from the TODO, since omit had some
   problems that needed a fix anyway.
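
   As a rough illustration of what federated missing-value imputation can look like in general, the sketch below assumes mean imputation in which each site reports only a partial sum and count, and the coordinator combines them into a global mean. All function names are hypothetical, and this is not the code from this pull request or the `EncoderMVImpute` implementation.

```python
from typing import List, Optional, Tuple

Column = List[Optional[float]]  # None marks a missing value


def partial_sum_count(column: Column) -> Tuple[float, int]:
    """Computed locally at each site: sum and count of the non-missing values."""
    present = [v for v in column if v is not None]
    return sum(present), len(present)


def global_mean(partials: List[Tuple[float, int]]) -> float:
    """Computed at the coordinator from the per-site partial aggregates."""
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    if count == 0:
        raise ValueError("no non-missing values at any site")
    return total / count


def impute(column: Column, fill: float) -> List[float]:
    """Applied locally at each site once the global fill value is known."""
    return [fill if v is None else v for v in column]


site_a: Column = [1.0, None, 3.0]
site_b: Column = [None, 5.0]

mean = global_mean([partial_sum_count(site_a), partial_sum_count(site_b)])
imputed_a = impute(site_a, mean)
imputed_b = impute(site_b, mean)
print(mean, imputed_a, imputed_b)
```

   Only the aggregates leave each site in this sketch; the raw column values stay local.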


