Hi Stavros, thanks for the detailed FLIP! Model serving is an important use case and it's great to see efforts to add a library for this to Flink!
I've read the FLIP and would like to ask a few questions and make some suggestions.

1) Is it a strict requirement that an ML pipeline must be able to handle different input types? I understand that it makes sense to have different models for different instances of the same type, i.e., same data type but different keys. Hence, the key-based joins make sense to me. However, couldn't completely different types be handled by different ML pipelines, or would there be major drawbacks?

2) I think from an API point of view it would be better not to require input records to be encoded as ProtoBuf messages. Instead, the model server could accept strongly-typed objects (Java/Scala) and (if necessary) convert them to ProtoBuf messages internally. In case we need to support different types of records (see my first point), we can introduce a Union type (i.e., an n-ary Either type). I see that we need some kind of binary encoding format for the models, but maybe this can also be designed to be pluggable such that other encodings can be added later.

3) I think the DataStream Java API should be supported as a first-class citizen for this library.

4) For the integration with the DataStream API, we could provide an API that receives (typed) DataStream objects, internally constructs the DataStream operators, and returns one (or more) result DataStreams. The benefit is that we don't need to change the DataStream API directly, but can put a library on top. The other libraries (CEP, Table, Gelly) follow this approach.

5) I'm skeptical about using queryable state to expose metrics. Did you consider using Flink's metrics system [1]? It is easily configurable and we provide several reporters that export the metrics.

What do you think?

Best, Fabian

[1] https://ci.apache.org/projects/flink/flink-docs-release-1.3/monitoring/metrics.html

2017-11-23 12:32 GMT+01:00 Stavros Kontopoulos <st.kontopou...@gmail.com>:

> Hi guys,
>
> Let's discuss the new FLIP proposal for model serving over Flink. The idea
> is to combine previous efforts there and provide a library on top of Flink
> for serving models.
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-23+-+Model+Serving
>
> Code from previous efforts can be found here: https://github.com/FlinkML
>
> Best,
> Stavros
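
P.S. To make the Union idea from point 2 a bit more concrete, here is a minimal sketch in plain Java. The names `ModelInput`, `of`, `tag`, and `as` are made up for illustration and are not part of Flink or of the FLIP; a real version would probably be generic over its member types and would need proper serializer support.

```java
import java.util.Objects;

// Hypothetical n-ary union (illustration only, not a Flink class):
// one record plus the index of the input-type slot it belongs to.
final class ModelInput {
    private final int tag;      // which of the n input types this record carries
    private final Object value; // the strongly-typed record itself

    private ModelInput(int tag, Object value) {
        this.tag = tag;
        this.value = Objects.requireNonNull(value, "value");
    }

    // Wraps a record under the given slot index.
    static ModelInput of(int tag, Object value) {
        return new ModelInput(tag, value);
    }

    int tag() {
        return tag;
    }

    // Returns the record as the expected type; throws ClassCastException on a mismatch.
    <T> T as(Class<T> type) {
        return type.cast(value);
    }
}
```

The model server could then inspect `tag()` to decide which pipeline a record belongs to and call `as(...)` to get the typed object back, so users would never have to build ProtoBuf messages themselves.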