Hi Stavros,

Thanks for the detailed FLIP!
Model serving is an important use case and it's great to see efforts to add
a library for this to Flink!

I've read the FLIP and would like to ask a few questions and make some
suggestions.

1) Is it a strict requirement that an ML pipeline must be able to handle
different input types?
I understand that it makes sense to have different models for different
instances of the same type, i.e., same data type but different keys. Hence,
the key-based joins make sense to me. However, couldn't completely
different types be handled by different ML pipelines or would there be
major drawbacks?

2) I think from an API point of view it would be better to not require
input records to be encoded as ProtoBuf messages. Instead, the model server
could accept strongly-typed objects (Java/Scala) and (if necessary) convert
them to ProtoBuf messages internally. In case we need to support different
types of records (see my first point), we can introduce a Union type (i.e.,
an n-ary Either type). I see that we need some kind of binary encoding
format for the models, but maybe this can also be designed to be pluggable
so that other encodings can be added later.
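
To make point 2 a bit more concrete, here is a minimal plain-Java sketch of
what I have in mind. All names (ModelInputSketch, Union2, RecordEncoder,
encodeInput) are hypothetical and are not part of Flink or the FLIP, and the
string-based encoders only stand in for a real binary format such as ProtoBuf.
A binary union is shown; an n-ary one could be built the same way or by
nesting.

```java
import java.nio.charset.StandardCharsets;
import java.util.function.Function;

// Hypothetical sketch: none of these types exist in Flink or in the FLIP.
class ModelInputSketch {

    /** A two-way union (Either); holds exactly one of the two values. */
    static final class Union2<A, B> {
        private final A first;
        private final B second;
        private final boolean isFirst;

        private Union2(A first, B second, boolean isFirst) {
            this.first = first;
            this.second = second;
            this.isFirst = isFirst;
        }

        static <A, B> Union2<A, B> first(A value) {
            return new Union2<>(value, null, true);
        }

        static <A, B> Union2<A, B> second(B value) {
            return new Union2<>(null, value, false);
        }

        /** Applies the function that matches the held alternative. */
        <R> R fold(Function<? super A, ? extends R> onFirst,
                   Function<? super B, ? extends R> onSecond) {
            return isFirst ? onFirst.apply(first) : onSecond.apply(second);
        }
    }

    /** Pluggable encoding: users hand in typed records, the library encodes them. */
    interface RecordEncoder<T> {
        byte[] encode(T record);
    }

    /** Encodes either alternative with its own encoder (placeholder text format). */
    static byte[] encodeInput(Union2<Integer, String> input) {
        RecordEncoder<Integer> intEncoder =
                i -> ("int:" + i).getBytes(StandardCharsets.UTF_8);
        RecordEncoder<String> textEncoder =
                s -> ("text:" + s).getBytes(StandardCharsets.UTF_8);
        return input.fold(intEncoder::encode, textEncoder::encode);
    }

    public static void main(String[] args) {
        byte[] a = encodeInput(Union2.first(42));
        byte[] b = encodeInput(Union2.second("hello"));
        System.out.println(new String(a, StandardCharsets.UTF_8)); // prints "int:42"
        System.out.println(new String(b, StandardCharsets.UTF_8)); // prints "text:hello"
    }
}
```

The point is only that the user-facing API stays strongly typed while the
wire format hides behind an interface that can be swapped out later.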

3) I think the DataStream Java API should be supported as a first-class
citizen for this library.

4) For the integration with the DataStream API, we could provide an API
that receives (typed) DataStream objects, internally constructs the
DataStream operators, and returns one (or more) result DataStreams. The
benefit is that we don't need to change the DataStream API directly, but
put a library on top. The other libraries (CEP, Table, Gelly) follow this
approach.

5) I'm skeptical about using queryable state to expose metrics. Did you
consider using Flink's metrics system [1]? It is easily configurable and we
provide several reporters that export the metrics.

What do you think?
Best, Fabian

[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.3/monitoring/metrics.html

2017-11-23 12:32 GMT+01:00 Stavros Kontopoulos <st.kontopou...@gmail.com>:

> Hi guys,
>
> Let's discuss the new FLIP proposal for model serving over Flink. The idea
> is to combine previous efforts there and provide a library on top of Flink
> for serving models.
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-23+-+Model+Serving
>
> Code from previous efforts can be found here: https://github.com/FlinkML
>
> Best,
> Stavros
>
