Hi Badrul,

Adding to this discussion: I think we can start with what we have already
implemented. We do not need to implement every last function; a use-case
based approach should give the best results. I would start from the present
state of the builtins (they already cover a lot of use cases), then
implement the missing ones one by one, based on priority. Most of our
builtin functions, other than ML (including the NN library), are inspired
by the R language.

During implementation and testing, we might need to modify our system
internals, and we could find optimization opportunities in them.

One possible approach:
1. Take an algorithm/product that is already implemented in another
system/library.
2. Find places where SystemDS can perform better. Look for the low-hanging
fruit: can we use one of our Python builtins, or a combination of them, to
achieve similar or better results? Can we improve it further?
3. At this point, we have identified a candidate for a builtin.
4. Repeat the cycle.
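
To make the prioritization concrete, here is a minimal sketch of how the
checklist from the quoted proposal could be tracked and sorted by priority.
The checklist columns come from that proposal; the Entry class, the
priority field, the next_candidates helper, and the sample rows are all
hypothetical and do not reflect the real status of any builtin.

```python
from dataclasses import dataclass, field

# Checklist columns taken from the proposal in the quoted message.
COLUMNS = ("DML", "Python", "CP", "SP", "RowCol", "Row", "Col",
           "Federated", "documentationPublished")


@dataclass
class Entry:
    library: str    # reference library, e.g. "numpy"
    function: str   # function name in that library
    priority: int   # hypothetical use-case priority, 1 = most urgent
    done: set = field(default_factory=set)  # checklist columns already covered

    def missing(self):
        """Return the checklist columns not yet covered, in column order."""
        return [c for c in COLUMNS if c not in self.done]


def next_candidates(entries):
    """Entries that still have open checklist items, most urgent first."""
    todo = [e for e in entries if e.missing()]
    return sorted(todo, key=lambda e: (e.priority, e.library, e.function))


# Made-up rows purely for illustration.
matrix = [
    Entry("numpy", "f_1", priority=2, done={"DML", "CP"}),
    Entry("sklearn", "f_2", priority=1, done={"DML", "Python", "CP", "SP"}),
    Entry("pandas", "f_3", priority=3, done=set(COLUMNS)),
]

for e in next_candidates(matrix):
    print(f"{e.library}.{e.function}: missing {', '.join(e.missing())}")
```

Fully covered functions drop out of the list, so each pass of the cycle
only looks at what is still open.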


Best regards,
Janardhan



On Tue, Aug 2, 2022 at 2:09 AM Badrul Chowdhury
<badrulchowdhur...@gmail.com> wrote:
>
> Hi,
>
> I wanted to start a discussion on building parity of built-in functions
> with popular OSS libraries. I am thinking of attaining parity as a 3-step
> process:
>
> *Step 1*
> As far as I can tell from the existing built-in functions, SystemDS aims to
> offer a hybrid set of APIs for scientific computing and ML (data
> engineering included) to users. Therefore, the most obvious OSS libraries
> for comparison would be numpy, sklearn (scipy), and pandas. Apache
> DataSketches would be another relevant system for specialized use cases
> (sketches).
>
> *Step 2*
> Once we have established a set of libraries, I would propose that we create
> a capability matrix with sections for each library, like so:
>
> Section 1: numpy
>   f_1
>   f_2
>   [..]
>   f_n
>
> Section 2: sklearn
>   [..]
>
> The columns could be a checklist like this: f_i -> (DML, Python, CP, SP,
> RowCol, Row, Col, Federated, documentationPublished)
>
> *Step 3*
> Create JIRA tasks, assign them, and start coding.
>
>
> Thoughts?
>
>
> Thanks,
> Badrul
