Thanks for reaching out, Sagar. Let me first give some context and
then some proposals for how to replace these three UDFs:
a) Context: When we removed the MR backend in favor of Spark and new
backends, we also removed the UDF framework because it was only
implemented for MR/in-memory UDFs. Originally, we wanted to build a new
UDF framework (for Spark and other backends), but then realized there are
better ways to cover these use cases: DML-bodied builtin functions, new
primitives, and simple custom lambda functions.
b) Alternatives:
* MultiInputCbind: this was mostly a performance feature for avoiding
the intermediate allocations of "cbind(cbind(cbind(X1,X2),X3),X4)", but
we now have rewrites that internally turn such a sequence into an n-ary
cbind(X1,X2,X3,X4) (and equivalently for rbind) anyway. So one could
either write the above sequence, or directly call the n-ary cbind via
"cbind(list(X1,X2,X3,X4))".
* CumSumProd: this functionality is now directly supported via
cumsumprod(Y) where Y=cbind(X,C). There is also a joint paper on it with
previous SystemML team members from IBM:
https://mboehm7.github.io/resources/btw2019.pdf
* RowClassMeet: This one is a bit tricky. At first I thought I should
recommend our new lambda UDF functions (for custom cell/row/column
operations) via "map(F, "x -> ...", margin=1)", but I think the easiest
option for you would be if we add this as a directly supported
multi-return function (like eigen()). I'll reintegrate this
functionality into the repository tomorrow morning.
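
To make the CumSumProd replacement easier to check during the migration,
here is a small plain-Python reference (not DML) of the recurrence that
cumsumprod(cbind(X,C)) is assumed to compute, namely
y[i] = x[i] + c[i] * y[i-1] with an initial value of 0. The function name
cumsumprod_reference and the start parameter are hypothetical and only
for illustration; please verify the recurrence against the paper above
before relying on it.

```python
def cumsumprod_reference(x, c, start=0.0):
    """Reference sketch of the ASSUMED cumsumprod recurrence.

    Computes y[i] = x[i] + c[i] * y[i-1], where the value before the
    first row is `start` (hypothetical parameter, default 0).
    x and c play the role of the two columns of Y = cbind(X, C).
    """
    if len(x) != len(c):
        raise ValueError("x and c must have the same length")
    result = []
    prev = start
    for xi, ci in zip(x, c):
        prev = xi + ci * prev  # carry the weighted running value forward
        result.append(prev)
    return result


# With all weights equal to 1 the recurrence reduces to a plain cumsum.
print(cumsumprod_reference([1.0, 2.0, 3.0], [1.0, 1.0, 1.0]))  # [1.0, 3.0, 6.0]
# With weights of 0.5 the previous value is halved before each addition.
print(cumsumprod_reference([1.0, 2.0, 3.0], [0.5, 0.5, 0.5]))  # [1.0, 2.5, 4.25]
```

Comparing the output of such a reference on a few small vectors with the
result of the DML call is a cheap way to confirm that the old UDF and the
builtin agree for your inputs.
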
Regards,
Matthias
On 6/5/2024 7:18 AM, Sagar Simon Gonsalves wrote:
Hi,
I’m working on moving our code base from SystemML 1.2.0 to SystemDS 3.2.0,
and for that I’m looking for three UDFs from SystemML that I couldn’t find in
SystemDS.
The UDF equivalents that I’m looking for in SystemDS are:
1. org.apache.sysml.udf.lib.RowClassMeet
2. org.apache.sysml.udf.lib.CumSumProd
3. org.apache.sysml.udf.lib.MultiInputCbind
Can you please let me know if SystemDS has any equivalent functions or UDFs
available?
Thanks,
Sagar Simon Gonsalves
Software Developer
Silicon Valley Lab
IBM