Hello Roman,

Thank you for bringing up this topic. I think it's a great idea to think
through how ML and Druid can be integrated more tightly.

I think one of the most beneficial approaches for ML training would be easy
and intuitive integration with the ML ecosystem: easy access to Druid data
sources from Python in general and pandas in particular, along with Jupyter
notebooks and other popular ML projects.
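
As a rough illustration of the kind of access I mean, here is a minimal
sketch that uses only the Python standard library and Druid's SQL HTTP
endpoint (/druid/v2/sql). The host, port, and the build_sql_request and
fetch_rows helper names are my own assumptions for illustration, not an
existing client library.

```python
import json
import urllib.request

# Assumption: a Druid router or broker is reachable at this hypothetical address.
DRUID_SQL_URL = "http://localhost:8888/druid/v2/sql"


def build_sql_request(sql, url=DRUID_SQL_URL):
    """Build (but do not send) a POST request for Druid's SQL endpoint."""
    body = json.dumps({"query": sql}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_rows(sql, url=DRUID_SQL_URL):
    """Send the query and return the result as a list of dicts, one per row."""
    with urllib.request.urlopen(build_sql_request(sql, url)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With pandas installed, the result of fetch_rows(...) can be passed straight
to pandas.DataFrame(...) inside a Jupyter notebook, since the endpoint
returns one JSON object per row by default.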

However, I don't see how models can be incorporated into Druid. Unlike
Spark or Flink, Druid is not designed to execute user-supplied code.
From my perspective, trying to execute ML logic on the Druid side would be
similar to the "stored procedures" approach, which would most likely hurt
scalability. That said, please take my opinion with a grain of salt; I'm
not an in-depth Druid expert.

Best,
Sayat



On Fri, Jan 10, 2020 at 6:41 AM Roman Leventov <leventov...@gmail.com>
wrote:

> Hello Druid developers, what do you think about the future of Druid &
> machine learning?
>
> Druid has been great at complex aggregations. Could (should?) it make
> inroads into ML? Perhaps aggregators which apply some pre-trained model
> to the rows and summarize the results.
>
> Should model training stay completely external to Druid, or could it be
> incorporated into Druid's data lifecycle on a conceptual level, such as a
> recurring "indexing" task which stores the result (the model) in Druid's
> deep storage, with the model automatically loaded on historical nodes as
> needed (just like segments) and certain aggregators picking up the latest
> model?
>
> Does this make any sense? In what cases will Druid & ML work well
> together, and in what cases will they not, so that ML should remain
> Spark's prerogative?
>
> I would be very interested to hear any thoughts on the topic, vague ideas
> and questions.
>


