OK, I think there was plenty of time to comment on this FLIP. I'll move it to the ACCEPTED status.
@Stavros, please consider the feedback regarding the metrics. I agree with Chesnay that metrics should primarily be exposed via the metrics system. Storing them in state additionally makes them fault-tolerant and queryable, if the state is properly configured.

Thanks, Fabian

2018-01-22 17:19 GMT+01:00 Chesnay Schepler <ches...@apache.org>:

> I'm currently looking over it, but one thing that stood out was that the FLIP proposes to use queryable state as a monitoring solution. Given that we have a metrics system that integrates with plenty of commonly used metric backends, this doesn't really make sense to me.
>
> Storing them in state still has value in terms of fault tolerance though, since this is something that the metrics system doesn't provide by itself.
>
> On 18.01.2018 13:57, Fabian Hueske wrote:
>
>> Are there any more comments on the FLIP?
>>
>> Otherwise, I'd suggest moving the FLIP to the accepted FLIPs [1] and continuing with the implementation.
>>
>> Also, is there a committer who'd like to shepherd the FLIP and review the corresponding PRs? Of course, everybody is welcome to review the code, but we need at least one committer who will eventually merge the changes.
>>
>> Best,
>> Fabian
>>
>> [1] https://cwiki.apache.org/confluence/display/FLINK/Flink+Improvement+Proposals
>>
>> 2017-12-04 10:54 GMT+01:00 Fabian Hueske <fhue...@gmail.com>:
>>
>>> Hi,
>>>
>>> Sorry for the late follow-up.
>>>
>>> I think I understand the motivation for choosing ProtoBuf as the representation and serialization format, and this makes sense to me.
>>>
>>> However, it might be a good idea to provide tooling to convert Flink types (described as TypeInformation) to ProtoBuf. Otherwise, users of the model serving library would need to manually convert their data types (say Scala tuples, case classes, or Avro Pojos) to ProtoBuf messages. I don't think this needs to be included in the first version, but it might be a good extension to make the library easier to use.
>>>
>>> Best,
>>> Fabian
>>>
>>> 2017-11-28 17:22 GMT+01:00 Boris Lublinsky <boris.lublin...@lightbend.com>:
>>>
>>>> Thanks Fabian,
>>>> More below
>>>>
>>>> Boris Lublinsky
>>>> FDP Architect
>>>> boris.lublin...@lightbend.com
>>>> https://www.lightbend.com/
>>>>
>>>> On Nov 28, 2017, at 8:21 AM, Fabian Hueske <fhue...@gmail.com> wrote:
>>>>
>>>> Hi Boris and Stavros,
>>>>
>>>> Thanks for the responses.
>>>>
>>>> Ad 1) Thanks for the clarification. I think I misunderstood this part of the proposal. I interpreted the argument for choosing ProtoBuf for network encoding ("ability to represent different data types") to mean that a single model pipeline should work on different data types. I agree that it should be possible to give records of the same type (but with different keys) to different models. The key-based join approach looks good to me.
>>>>
>>>> Ad 2) I understand that ProtoBuf is a good choice for serializing models, for the given reasons. However, the choice of ProtoBuf serialization for the records might make the integration with existing libraries and also regular DataStream programs more difficult. They all use Flink's TypeSerializer system to serialize and deserialize records by default. Hence, we would need to add a conversion step before records can be passed to a model serving operator.
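(For illustration, a minimal sketch of the conversion step mentioned in Ad 2) could look like the following. All names are made up for this example and are not part of the FLIP; WineRecord stands in for a protoc/ScalaPB-generated message class. The point is that records stay ordinary Flink types until the boundary of the model serving operator, where they are turned into wire bytes.)

    import org.apache.flink.streaming.api.scala._

    // Stand-in for a generated protobuf message class; in a real job this would
    // be generated from a .proto file (purely illustrative here).
    case class WineRecord(alcohol: Double, acidity: Double) {
      def toByteArray: Array[Byte] = s"$alcohol,$acidity".getBytes("UTF-8")
    }

    // A user-defined, strongly-typed record as it flows through the rest of the job.
    case class Wine(alcohol: Double, acidity: Double)

    object ProtoConversionSketch {
      def main(args: Array[String]): Unit = {
        val env = StreamExecutionEnvironment.getExecutionEnvironment

        val wines: DataStream[Wine] = env.fromElements(Wine(12.5, 0.30), Wine(13.1, 0.42))

        // Conversion happens only at the boundary of the model serving operator,
        // so regular DataStream code keeps using Flink's TypeSerializer system.
        val encoded: DataStream[Array[Byte]] =
          wines.map(w => WineRecord(w.alcohol, w.acidity).toByteArray)

        encoded.print()
        env.execute("protobuf conversion sketch")
      }
    }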
>>>> Are you expecting some common format that all records follow (such as a Row or Vector type), or do you plan to support arbitrary records such as Pojos? If you plan for a specific type, you could add a TypeInformation for this type with a TypeSerializer that is based on ProtoBuf.
>>>>
>>>> The way I look at it is slightly different. The common format for records, supported by Flink, is a byte array with a small header that describes the data type and is used for routing. The actual unmarshalling is done by the model implementation itself. This provides maximum flexibility and gives users the freedom to create their own types without breaking the underlying framework.
>>>>
>>>> Ad 4) @Boris: I made this point not about the serialization format but about how the library would integrate with Flink's DataStream API. I thought I had seen a code snippet that showed a new method on the DataStream object but cannot find it anymore. So, I just wanted to make the point that we should not change the DataStream API (unless it lacks support for some features) and build the model serving library on top of it. But I get from Stavros' answer that this is your design anyway.
>>>>
>>>> Ad 5) The metrics system is the default way to expose system and job metrics in Flink. Due to the pluggable reporter interface and the various reporters, they can easily be integrated into many production environments. A solution based on queryable state will always need custom code to access the information. Of course, this can be an optional feature.
>>>>
>>>> What do others think about this proposal?
>>>>
>>>> We had agreement among the working group (Eron, Bas, Andrea, etc.), but you are the first one outside of it. My book https://www.lightbend.com/blog/serving-machine-learning-models-free-oreilly-ebook-from-lightbend has reasonably good reviews, so we are hoping this will work.
>>>>
>>>> Best, Fabian
>>>>
>>>> 2017-11-28 13:53 GMT+01:00 Stavros Kontopoulos <st.kontopou...@gmail.com>:
>>>>
>>>>> Hi Fabian, thanks!
>>>>>
>>>>>> 1) Is it a strict requirement that a ML pipeline must be able to handle different input types? I understand that it makes sense to have different models for different instances of the same type, i.e., same data type but different keys. Hence, the key-based joins make sense to me. However, couldn't completely different types be handled by different ML pipelines, or would there be major drawbacks?
>>>>>
>>>>> Could you elaborate more on this? Right now we only use keys when we do the join. A given pipeline can handle only a well-defined type (the type can be a simple string with a custom value; it does not need to be a class type), which serves as a key.
>>>>>
>>>>> 2)
>>>>>
>>>>>> I think from an API point of view it would be better to not require input records to be encoded as ProtoBuf messages. Instead, the model server could accept strongly-typed objects (Java/Scala) and (if necessary) convert them to ProtoBuf messages internally. In case we need to support different types of records (see my first point), we can introduce a Union type (i.e., an n-ary Either type). I see that we need some kind of binary encoding format for the models, but maybe this can also be designed to be pluggable such that other encodings can be added later.
>>>>>
>>>>> We do use Scala classes (strongly typed classes); protobuf is only used on the wire. For on-the-wire encoding we prefer protobufs for size, expressiveness, and the ability to represent different data types.
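(To make the byte-array-with-header format Boris describes and the key-based join Stavros mentions more concrete, here is a minimal sketch under those assumptions: records travel as a payload plus a small routing header, models arrive on a second stream keyed the same way, and the current model is kept in keyed state. All class and field names are illustrative, not taken from the FLIP.)

    import org.apache.flink.api.common.state.{ValueState, ValueStateDescriptor}
    import org.apache.flink.configuration.Configuration
    import org.apache.flink.streaming.api.functions.co.RichCoFlatMapFunction
    import org.apache.flink.streaming.api.scala._
    import org.apache.flink.util.Collector

    // Record = byte-array payload plus a small header (`dataType`) used only for
    // routing; the model implementation is responsible for unmarshalling the payload.
    case class DataRecord(dataType: String, payload: Array[Byte])
    case class ModelToServe(dataType: String, modelBytes: Array[Byte])

    // Key-based join: both streams are keyed by the same header field, so the
    // current model for a data type lives in keyed state next to the records it scores.
    class ScoringFunction extends RichCoFlatMapFunction[DataRecord, ModelToServe, String] {
      @transient private var currentModel: ValueState[ModelToServe] = _

      override def open(parameters: Configuration): Unit = {
        currentModel = getRuntimeContext.getState(
          new ValueStateDescriptor[ModelToServe]("currentModel", classOf[ModelToServe]))
      }

      override def flatMap1(record: DataRecord, out: Collector[String]): Unit = {
        val model = currentModel.value()
        if (model != null) {
          // A real implementation would unmarshal record.payload and invoke the model here.
          out.collect(s"scored a ${record.dataType} record with a ${model.modelBytes.length}-byte model")
        }
      }

      override def flatMap2(model: ModelToServe, out: Collector[String]): Unit =
        currentModel.update(model)   // a newly arrived model replaces the previous one for this key
    }

    object KeyedJoinSketch {
      def main(args: Array[String]): Unit = {
        val env = StreamExecutionEnvironment.getExecutionEnvironment

        val data   = env.fromElements(DataRecord("wine", Array[Byte](1, 2, 3)))
        val models = env.fromElements(ModelToServe("wine", Array[Byte](4, 5, 6)))

        data.keyBy(_.dataType)
          .connect(models.keyBy(_.dataType))
          .flatMap(new ScoringFunction)
          .print()

        env.execute("key-based join sketch")
      }
    }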
>>>>> 3)
>>>>>
>>>>>> I think the DataStream Java API should be supported as a first-class citizen for this library.
>>>>>
>>>>> I agree. It should be either the first priority or the next thing to do.
>>>>>
>>>>> 4)
>>>>>
>>>>>> For the integration with the DataStream API, we could provide an API that receives (typed) DataStream objects, internally constructs the DataStream operators, and returns one (or more) result DataStreams. The benefit is that we don't need to change the DataStream API directly, but put a library on top. The other libraries (CEP, Table, Gelly) follow this approach.
>>>>>
>>>>> We will provide a DSL which will do just this. But even without the DSL, this is what we do with the low-level joins.
>>>>>
>>>>> 5)
>>>>>
>>>>>> I'm skeptical about using queryable state to expose metrics. Did you consider using Flink's metrics system [1]? It is easily configurable and we provide several reporters that export the metrics.
>>>>>
>>>>> This of course is an option. The choice of queryable state was mostly driven by the simplicity of real-time integration. Any reason why the metrics system is better?
>>>>>
>>>>> Best,
>>>>> Stavros
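(On the metrics question, a minimal sketch of exposing serving statistics through Flink's metrics system could look like this; names are illustrative, and any configured reporter, e.g. JMX or Graphite, would then export the counter without custom query code.)

    import org.apache.flink.api.common.functions.RichMapFunction
    import org.apache.flink.configuration.Configuration
    import org.apache.flink.metrics.Counter

    // Counts scored records via the operator's metric group; reporters configured in
    // flink-conf.yaml export the value, no queryable-state client code required.
    class ScoringWithMetrics extends RichMapFunction[Array[Byte], String] {
      @transient private var scoredRecords: Counter = _

      override def open(parameters: Configuration): Unit = {
        scoredRecords = getRuntimeContext.getMetricGroup.counter("scoredRecords")
      }

      override def map(record: Array[Byte]): String = {
        scoredRecords.inc()
        // A real implementation would unmarshal the record and score it against the current model.
        s"scored record of ${record.length} bytes"
      }
    }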
>>>>> On Mon, Nov 27, 2017 at 4:23 PM, Fabian Hueske <fhue...@gmail.com> wrote:
>>>>>
>>>>>> Hi Stavros,
>>>>>>
>>>>>> thanks for the detailed FLIP! Model serving is an important use case and it's great to see efforts to add a library for this to Flink!
>>>>>>
>>>>>> I've read the FLIP and would like to ask a few questions and make some suggestions.
>>>>>>
>>>>>> 1) Is it a strict requirement that a ML pipeline must be able to handle different input types? I understand that it makes sense to have different models for different instances of the same type, i.e., same data type but different keys. Hence, the key-based joins make sense to me. However, couldn't completely different types be handled by different ML pipelines, or would there be major drawbacks?
>>>>>>
>>>>>> 2) I think from an API point of view it would be better to not require input records to be encoded as ProtoBuf messages. Instead, the model server could accept strongly-typed objects (Java/Scala) and (if necessary) convert them to ProtoBuf messages internally. In case we need to support different types of records (see my first point), we can introduce a Union type (i.e., an n-ary Either type). I see that we need some kind of binary encoding format for the models, but maybe this can also be designed to be pluggable such that other encodings can be added later.
>>>>>>
>>>>>> 3) I think the DataStream Java API should be supported as a first-class citizen for this library.
>>>>>>
>>>>>> 4) For the integration with the DataStream API, we could provide an API that receives (typed) DataStream objects, internally constructs the DataStream operators, and returns one (or more) result DataStreams. The benefit is that we don't need to change the DataStream API directly, but put a library on top. The other libraries (CEP, Table, Gelly) follow this approach.
>>>>>>
>>>>>> 5) I'm skeptical about using queryable state to expose metrics. Did you consider using Flink's metrics system [1]? It is easily configurable and we provide several reporters that export the metrics.
>>>>>>
>>>>>> What do you think?
>>>>>> Best, Fabian
>>>>>>
>>>>>> [1] https://ci.apache.org/projects/flink/flink-docs-release-1.3/monitoring/metrics.html
>>>>>>
>>>>>> 2017-11-23 12:32 GMT+01:00 Stavros Kontopoulos <st.kontopou...@gmail.com>:
>>>>>>
>>>>>>> Hi guys,
>>>>>>>
>>>>>>> Let's discuss the new FLIP proposal for model serving over Flink. The idea is to combine previous efforts there and provide a library on top of Flink for serving models.
>>>>>>>
>>>>>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-23+-+Model+Serving
>>>>>>>
>>>>>>> Code from previous efforts can be found here: https://github.com/FlinkML
>>>>>>>
>>>>>>> Best,
>>>>>>> Stavros
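(As a footnote to point 4 above, a minimal sketch of the "library on top of the DataStream API" style of integration could look as follows. The ModelServing object and its score method are hypothetical names, not part of the FLIP, and the sketch reuses the DataRecord, ModelToServe, and ScoringFunction types from the join sketch earlier in the thread.)

    import org.apache.flink.streaming.api.scala._

    // The library accepts typed DataStreams, builds its operators internally, and
    // returns a result DataStream, leaving the DataStream API itself unchanged
    // (the same pattern CEP, Table, and Gelly follow).
    object ModelServing {
      def score(records: DataStream[DataRecord],
                models: DataStream[ModelToServe]): DataStream[String] =
        records.keyBy(_.dataType)
          .connect(models.keyBy(_.dataType))
          .flatMap(new ScoringFunction)
    }

    // Usage from a regular DataStream program:
    //   val results: DataStream[String] = ModelServing.score(dataStream, modelStream)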