Fabian,

OK, thanks for the update. In the meantime I was looking at how I could
still leverage the current FlinkML API, but as far as I can see, it lacks
the ability to persist its own models. So even for pure batch workloads, it
prevents you from reusing a model (once built) across several jobs? Or am I
missing something?
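(For what it's worth, the workaround I'm experimenting with is to extract the learned weight vector after training and persist it myself. A minimal stdlib-only sketch of that round trip — the FlinkML calls are omitted, and `ModelPersistence`, the CSV format, and the weight values are just my own illustration, not anything FlinkML provides:)

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.stream.Collectors;

public class ModelPersistence {
    // Serialize a learned weight vector to a simple one-line CSV.
    static void saveWeights(double[] weights, Path path) throws IOException {
        String line = Arrays.stream(weights)
                .mapToObj(Double::toString)
                .collect(Collectors.joining(","));
        Files.writeString(path, line);
    }

    // Reload the weights in a later job.
    static double[] loadWeights(Path path) throws IOException {
        String[] parts = Files.readString(path).trim().split(",");
        return Arrays.stream(parts).mapToDouble(Double::parseDouble).toArray();
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("model", ".csv");
        // Stand-in for the weights a batch fit() would produce.
        double[] trained = {0.5, -1.2, 3.0};
        saveWeights(trained, path);
        double[] reloaded = loadWeights(path);
        System.out.println(Arrays.equals(trained, reloaded)); // true
    }
}
```

Crude, of course — it only works for models that reduce to a weight vector — which is why a proper save/load API would be nice.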

I suspect I am not the only one who would love to apply machine learning as
part of a Flink pipeline. While waiting for FLIP-23, what are the "best"
practices today?
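(To make the second half concrete: the pattern I had in mind is a plain DataStream job whose map function loads the persisted weights once and then scores each incoming event — in Flink terms, loading in `RichMapFunction#open()` and scoring in `map()`. A dependency-free sketch of just that scoring logic; `LinearScorer`, the events, and the weight values are all illustrative, not an existing API:)

```java
import java.util.Arrays;

public class StreamScoringSketch {
    // In a real Flink job this would live inside a RichMapFunction:
    // weights loaded once in open(), score() invoked per element in map().
    static class LinearScorer {
        private final double[] weights;

        LinearScorer(double[] weights) {
            this.weights = weights;
        }

        // Dot product of the model weights with one event's features.
        double score(double[] features) {
            double s = 0.0;
            for (int i = 0; i < weights.length; i++) {
                s += weights[i] * features[i];
            }
            return s;
        }
    }

    public static void main(String[] args) {
        LinearScorer scorer = new LinearScorer(new double[]{0.5, -1.2, 3.0});
        // Each array stands in for one event arriving on the stream.
        double[][] events = {{1, 0, 0}, {0, 1, 1}};
        Arrays.stream(events)
              .map(scorer::score)
              .forEach(System.out::println); // one score per event
    }
}
```

Is that roughly what people do today, or is there something better?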

Thanks again for your help,
--
Christophe

On Mon, Feb 5, 2018 at 6:01 PM, Fabian Hueske <fhue...@gmail.com> wrote:

> Hi Christophe,
>
> it is true that FlinkML only targets batch workloads. Also, there has not
> been any development for a long time.
>
> In March last year, a discussion was started on the dev mailing list about
> different machine learning features for stream processing [1].
> One result of this discussion was FLIP-23 [2] which will add a library for
> model serving to Flink, i.e., it can load (and update) machine learning
> models and evaluate them on a stream.
> If you dig through the mailing list thread, you'll find a link to a Google
> doc that discusses other possible directions.
>
> Best, Fabian
>
> [1] https://lists.apache.org/thread.html/eeb80481f3723c160bc923d689416a352d6df4aad98fe7424bf33132@%3Cdev.flink.apache.org%3E
> [2] https://cwiki.apache.org/confluence/display/FLINK/FLIP-23+-+Model+Serving
>
> 2018-02-05 16:43 GMT+01:00 Christophe Jolif <cjo...@gmail.com>:
>
>> Hi all,
>>
>> Sorry, this is me again with another question.
>>
>> Maybe I did not search deep enough, but it seems the FlinkML API is still
>> pure batch.
>>
>> If I read https://cwiki.apache.org/confluence/display/FLINK/FlinkML%3A+Vision+and+Roadmap
>> it seems there was the intent to "exploit the streaming nature of Flink,
>> and provide functionality designed specifically for data streams", but
>> from my external point of view, I don't see much happening here. Is there
>> work in progress towards that?
>>
>> I would personally see two use-cases around streaming: the first around
>> updating an existing model that was built in batch, the second around
>> triggering predictions from a stream job rather than a batch job.
>>
>> Are these things in the works? Or are they maybe already feasible,
>> despite the API looking purely batch-oriented?
>>
