+1 for: "Apache SystemDS - An open source ML system for the end-to-end data science lifecycle"

The webpage has to be changed here:
https://github.com/apache/systemds-website/blob/master/_src/_includes/themes/apache/home.html

In that process, it would probably be good to go through the text on the main webpage as well. For instance, the first sentence describing SystemDS is:
"Apache SystemDS provides an optimal workplace for machine learning using big data"

I would also like to point out that the graphical resources on the webpage still contain SystemML, so we should remove or replace them.

Regards
Sebastian

________________________________
From: arnab phani <phaniar...@gmail.com>
Sent: Tuesday, May 18, 2021 8:02:44 PM
To: dev@systemds.apache.org
Subject: Re: [DISCUSS] SystemDS project description

I like "Apache SystemDS - An open source ML system for the end-to-end data science lifecycle". The only thing is that "open source" sounds a bit redundant given that the name includes Apache. But in places where "Apache" is not mentioned (e.g. PyPI), this description is apt.

Regards,
Arnab..

On Tue, May 18, 2021 at 7:53 PM Matthias Boehm <mboe...@gmail.com> wrote:

> thanks for initiating this discussion and there are indeed a couple of
> things we need to clean up. Just for the future, please ask before
> adding even more to this diversity (I understand you just recently
> changed the GitHub summary proactively without such discussion).
>
> ad 1) DML stands for Declarative ML Language and its design philosophy
> is based on a declarative specification in terms of providing data
> independence (abstract data types, no hard-coding of
> dense/sparse/compressed), and implementation-agnostic operations (no
> hard-coding of local vs distributed vs federated vs HW accelerator
> operations).
>
> ad 2) When merging SystemDS into Apache SystemDS, I changed the JIRA
> summary to "Apache SystemDS - An open source ML system for the
> end-to-end data science lifecycle" and I still like this best because we
> want to have a stable name, independent of trends of underlying
> execution models. As a side note, I always disliked the phrase "A machine
> learning platform optimal for big data" (use of optimal, big data
> wording). However, this is just my opinion, and I think it's a good
> point to discuss this once and for all (for the foreseeable future at
> least). Any thoughts?
>
> Regards,
> Matthias
>
> On 5/18/2021 4:18 PM, Janardhan wrote:
> > Hi all,
> >
> > We are using different descriptions at various places. It would be better
> > to exemplify each term more clearly. Sorry if I am asking something
> > obvious.
> >
> > 1. Which one should we use as the project description?
> >    note: Although the description given in the SystemDS research paper can
> >    be considered - the paper was published before the merge into SystemML.
> >
> > 2. Also, what is the full form of DML?
> >    a. Declarative Machine Learning Language
> >    b. Descriptive Machine Learning Language
> >    c. ..
> >
> > Research paper [1]:
> > SystemDS: A Declarative Machine Learning System for the End-to-End Data
> > Science Lifecycle
> >
> > GitHub
> > Apache SystemDS - A versatile system for the end-to-end data science
> > lifecycle
> >
> > PyPI
> > SystemDS is a distributed and declarative machine learning platform.
> >
> > systemds.apache.org
> > A machine learning platform optimal for big data
> >
> > Jira
> > Apache SystemDS - An open source ML system for the end-to-end data science
> > lifecycle
> >
> > ---
> > SystemDS game plan [1] major parts:
> >
> > 1. DSL-based, High-level Abstractions: We aim to provide a hierarchy of
> > abstractions for the different lifecycle tasks as well as users with
> > different expertise
> >
> > 2. Hybrid Runtime Plans and Optimizing Compiler: To support the wide
> > variety of algorithm classes, we will continue to provide different
> > parallelization strategies, enriched by a new backend for federated ML
> > and privacy enhancing technologies.
> >
> > 3. Data Model - Heterogeneous Tensors: Supporting data integration and
> > cleaning primitives in linear algebra programs requires a more generic
> > data model for handling heterogeneous and structured data. In contrast to
> > existing ML systems, our central data models are heterogeneous tensors.
> >
> > [1] https://arxiv.org/abs/1909.02976
> > [2] Roadmap discussion - https://s.apache.org/systemds-roadmap
> >
> > Thank you,
> > Janardhan
> >
>