Hi Badrul,

Adding to this discussion, I think we can start with what we already have implemented. We do not need to implement every last function; we can take a use-case-based approach for best results.

I would start with the present status of the builtins (they are enough for a lot of use cases!) and then implement the missing ones one by one, based on priority. Most of our builtin functions other than ML (including the NN library) are inspired by the R language.

During implementation and testing, we might need to modify our system internals, or we could find optimization opportunities in them.

One possible approach:

1. Take an algorithm/product that is already implemented in another system/library.
2. Find places where SystemDS can perform better. Look for the low-hanging fruit: can we use one of our Python builtins, or a combination of them, to achieve similar or better results? And can we improve it further?
3. At this point, we have identified a candidate for a builtin.
4. Repeat the cycle.

Best regards,
Janardhan

On Tue, Aug 2, 2022 at 2:09 AM Badrul Chowdhury <badrulchowdhur...@gmail.com> wrote:
>
> Hi,
>
> I wanted to start a discussion on building parity of built-in functions
> with popular OSS libraries. I am thinking of attaining parity as a 3-step
> process:
>
> *Step 1*
> As far as I can tell from the existing built-in functions, SystemDS aims to
> offer a hybrid set of APIs for scientific computing and ML (data
> engineering included) to users. Therefore, the most obvious OSS libraries
> for comparison would be numpy, sklearn (scipy), and pandas. Apache
> DataSketches would be another relevant system for specialized use cases
> (sketches).
>
> *Step 2*
> Once we have established a set of libraries, I would propose that we create
> a capability matrix with sections for each library, like so:
>
> Section 1: numpy
>
> f_1
>
> f_2
>
> [..]
>
> f_n
>
> Section 2: sklearn
>
> [..]
>
> The columns could be a checklist like this: f_i -> (DML, Python, CP, SP,
> RowCol, Row, Col, Federated, documentationPublished)
>
> *Step 3*
> Create JIRA tasks, assign them, and start coding.
>
> Thoughts?
>
> Thanks,
> Badrul