We shall put this up for a formal vote once suitable description(s)
are found. :)

Terms:

1. ML System
a. Does it mean the way our compiler understands the code and optimizes
it for algorithms in general (not just ML-specific algorithms)?
b. Or is it about the ML algorithms?

2. System vs Platform
We seem to have preferred "system" over "platform"!

3. Big data or data science
The software works fine for small to big data, so "big data" may not be
relevant.

---
Descriptions of other projects with related objectives:

1. TensorFlow - An end-to-end open source machine learning platform
TensorFlow sticks to the ML pipeline [1]. Their pipeline roughly looks like this:

ML metadata -> Data validation -> Transform -> ML training -> Model
analysis -> serving/deployment

2. H2o.ai - H2O is a fully open source, distributed in-memory machine
learning platform with linear scalability [2]
Strikingly, most of the H2o functionality is similar to SystemML.
Pipeline:
Load data -> Exploratory data analysis and feature selection -> Modeling,
model evaluation, & selection -> prediction

3. MXNet - A flexible and efficient library for deep learning [3].
Their GitHub description: "Lightweight, Portable, Flexible
Distributed/Mobile Deep Learning with Dynamic, Mutation-aware
Dataflow Dep Scheduler"

[1] https://www.tensorflow.org/tfx
[2] https://www.h2o.ai/products/h2o/
[3] https://mxnet.apache.org/versions/1.8.0/

Thank you,
Janardhan


On Wed, May 19, 2021 at 12:06 AM Baunsgaard, Sebastian
<baunsga...@tugraz.at.invalid> wrote:

> +1 for : "Apache SystemDS - An open source ML system for the end-to-end
> data science lifecycle"
>
> The webpage has to be changed here:
>
> https://github.com/apache/systemds-website/blob/master/_src/_includes/themes/apache/home.html
>
> In that process, going through the text on the main webpage would also
> be good.
> For instance, the first sentence describing SystemDS is:
>
> "Apache SystemDS provides an optimal workplace for machine learning using
> big data"
>
> I would also like to point out that the graphical resources on the webpage
> still contain SystemML, so we should remove or replace them.
>
> Regards
> Sebastian
> ________________________________
> From: arnab phani <phaniar...@gmail.com>
> Sent: Tuesday, May 18, 2021 8:02:44 PM
> To: dev@systemds.apache.org
> Subject: Re: [DISCUSS] SystemDS project description
>
> I like "Apache SystemDS - An open source ML system for the end-to-end data
> science lifecycle".
> The only thing is that "open source" sounds a bit redundant given that the
> name includes Apache.
> But at places where "Apache" is not mentioned (e.g. PyPI), this description
> is apt.
>
> Regards,
> Arnab..
>
> On Tue, May 18, 2021 at 7:53 PM Matthias Boehm <mboe...@gmail.com> wrote:
>
> > Thanks for initiating this discussion and there are indeed a couple of
> > things we need to clean up. Just for the future, please ask before
> > adding even more to this diversity (I understand you just recently
> > changed the github summary proactively without such discussion).
> >
> > ad 1) DML stands for Declarative ML Language and its design philosophy
> > is based on a declarative specification in terms of providing data
> > independence (abstract data types, no hard coding of
> > dense/sparse/compressed), and implementation-agnostic operations (no
> > hard-coding of local vs distributed vs federated vs HW accelerator
> > operations).
> >
> > ad 2) When merging SystemDS into Apache SystemDS, I changed the JIRA
> > summary to "Apache SystemDS - An open source ML system for the
> > end-to-end data science lifecycle" and I still like this best because we
> > want to have a stable name, independent of trends of underlying
> > execution models. As a side note, I always disliked the phrase "A machine
> > learning platform optimal for big data" (use of "optimal" and "big data"
> > wording). However, this is just my opinion, and I think it's a good
> > point to discuss this once and for all (for the foreseeable future at
> > least). Any thoughts?
> >
> > Regards,
> > Matthias
> >
> > On 5/18/2021 4:18 PM, Janardhan wrote:
> > > Hi all,
> > >
> > > We are using different descriptions at various places. It would be better
> > > to exemplify each term more clearly. Sorry if I am asking something
> > > obvious.
> > >
> > > 1. Which one should we use as the project description?
> > > Note: although the description given in the SystemDS research paper can
> > > be considered, the paper was published before the merge into SystemML.
> > >
> > > 2. Also, what is the full form of DML?
> > >      a. Declarative Machine Learning Language
> > >      b. Descriptive Machine Learning Language
> > >      c. ..
> > >
> > > Research paper [1]:
> > > SystemDS: A Declarative Machine Learning System for the End-to-End Data
> > > Science Lifecycle
> > >
> > > GitHub
> > > Apache SystemDS - A versatile system for the end-to-end data science
> > > lifecycle
> > >
> > > PyPI
> > > SystemDS is a distributed and declarative machine learning platform.
> > >
> > > systemds.apache.org
> > > A machine learning platform optimal for big data
> > >
> > > Jira
> > > Apache SystemDS - An open source ML system for the end-to-end data
> > > science lifecycle
> > >
> > > ---
> > > SystemDS game plan [1] major parts:
> > >
> > > 1. DSL-based, High-level Abstractions: We aim to provide a hierarchy of
> > > abstractions for the different lifecycle tasks as well as users with
> > > different expertise
> > >
> > > 2. Hybrid Runtime Plans and Optimizing Compiler: To support the wide
> > > variety of algorithm classes, we will continue to provide different
> > > parallelization strategies, enriched by a new backend for federated ML
> > > and privacy enhancing technologies.
> > >
> > > 3. Data Model - Heterogeneous Tensors: Supporting data integration and
> > > cleaning primitives in linear algebra programs requires a more generic
> > > data model for handling heterogeneous and structured data. In contrast to
> > > existing ML systems, our central data models are heterogeneous tensors.
> > >
> > > [1] https://arxiv.org/abs/1909.02976
> > > [2] Roadmap discussion - https://s.apache.org/systemds-roadmap
> > >
> > > Thank you,
> > > Janardhan
> > >
> >
>
