Thanks for initiating this discussion; there are indeed a couple of things we need to clean up. Just for the future, please ask before adding even more to this diversity (I understand you recently changed the GitHub summary proactively without such a discussion).

ad 1) DML stands for Declarative ML Language, and its design philosophy is based on declarative specification: data independence (abstract data types, no hard-coding of dense/sparse/compressed representations) and implementation-agnostic operations (no hard-coding of local vs. distributed vs. federated vs. HW accelerator operations).

ad 2) When merging SystemDS into Apache SystemDS, I changed the JIRA summary to "Apache SystemDS - An open source ML system for the end-to-end data science lifecycle", and I still like this best because we want a stable name, independent of trends in underlying execution models. As a side note, I always disliked the phrase "A machine learning platform optimal for big data" (use of "optimal", big data wording). However, this is just my opinion, and I think this is a good point to discuss it once and for all (for the foreseeable future at least). Any thoughts?

Regards,
Matthias

On 5/18/2021 4:18 PM, Janardhan wrote:
Hi all,

We are using different descriptions in various places. It would be better
to define each term more clearly. Sorry if I am asking something
obvious.

1. Which one should we use as the project description?
note: The description given in the SystemDS research paper could be
considered, although the paper was published before the merge into SystemML.

2. Also, what is the full form of DML?
     a. Declarative Machine Learning Language
     b. Descriptive Machine Learning Language
     c. ..

Research paper [1]:
SystemDS: A Declarative Machine Learning System for the End-to-End Data
Science Lifecycle

GitHub
Apache SystemDS - A versatile system for the end-to-end data science
lifecycle

PyPI
SystemDS is a distributed and declarative machine learning platform.

systemds.apache.org
A machine learning platform optimal for big data

Jira
Apache SystemDS - An open source ML system for the end-to-end data science
lifecycle

---
Major parts of the SystemDS game plan [1]:

1. DSL-based, High-level Abstractions: We aim to provide a hierarchy of
abstractions for the different lifecycle tasks as well as users with
different expertise

2. Hybrid Runtime Plans and Optimizing Compiler: To support the wide
variety of algorithm classes, we will continue to provide different
parallelization strategies, enriched by a new backend for federated ML
and privacy enhancing technologies.

3. Data Model - Heterogeneous Tensors: To support data integration and
cleaning primitives in linear algebra programs requires a more generic
data model for handling heterogeneous and structured data. In contrast to
existing ML systems, our central data models are heterogeneous tensors.

[1] https://arxiv.org/abs/1909.02976
[2] Roadmap discussion - https://s.apache.org/systemds-roadmap

Thank you,
Janardhan
