Re: Druid and machine learning

2020-01-27 Thread Roman Leventov
I was thinking about model training on the Druid indexing side and evaluation on the Druid querying side. The advantage Druid has over Spark at query time is faster row filtering thanks to bitset indexes. But since model evaluation is a pretty heavy operation (I suppose; does anyone have ballpark time

Re: Druid and machine learning

2020-01-27 Thread Charles Allen
> it makes more sense to have tooling around Druid, to do slice and dice the data that you need, and do the ml stuff in sklearn, or even in spark
I agree with this sentiment. Druid as an execution engine is very good at doing distributed aggregation (distributed reduce). What advantage does

Re: Druid and machine learning

2020-01-27 Thread Driesprong, Fokko
> Vertica has it. Good idea to introduce it in Druid.
I'm not sure if this is a valid argument. With this argument, you can introduce anything into Druid. I think it is good to be opinionated and, as a community, to be clear about why we do or don't introduce ML possibilities into the software. For example,