Or use MLflow's PySpark UDF support. First create an mlflow.pyfunc model, then apply it across the cluster with mlflow.pyfunc.spark_udf.

Best, Ravion
On Sun, Sep 6, 2020, 9:43 AM ☼ R Nair <ravishankar.n...@gmail.com> wrote:

> The question is not clear; use accumulators, if I understood it correctly.
>
> Best, Ravion
>
> On Sun, Sep 6, 2020, 9:41 AM Ankur Das <dasankur...@gmail.com> wrote:
>
>> Good Evening Sir/Madam,
>> Hope you are doing well. I am experimenting with some ML techniques that I
>> need to test in a distributed environment.
>> For example, I want to run a particular algorithm on different nodes at
>> the same time and collect the results at the end on a single node, the
>> parent node.
>>
>> So I would like to know whether Spark is a possible or good choice for this.
>>
>> Hope to hear from you soon. Stay safe and healthy.
>> Thanking you in advance.
>> --
>> Regards,
>> Ankur J Das
>> Research Scholar @ Tezpur University
>> Tezpur, Assam