Thanks @Fabian. I will update the document accordingly with respect to the metrics. I agree there are pros and cons.

Best,
Stavros
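For illustration, a minimal sketch of the approach suggested in the quoted thread below: exposing a serving counter through Flink's metric system from inside a rich function, so that any configured reporter exports it. The Record/Scored types and the operator are assumptions for the example, not part of the FLIP.

import org.apache.flink.api.common.functions.RichFlatMapFunction
import org.apache.flink.configuration.Configuration
import org.apache.flink.metrics.Counter
import org.apache.flink.util.Collector

// Illustrative types, not from the FLIP.
case class Record(dataType: String, payload: Array[Byte])
case class Scored(dataType: String, score: Double)

class ScoringFunction extends RichFlatMapFunction[Record, Scored] {

  @transient private var served: Counter = _

  override def open(parameters: Configuration): Unit = {
    // Registered with Flink's metric system; any configured reporter
    // (JMX, Prometheus, Graphite, ...) picks it up without custom code.
    served = getRuntimeContext.getMetricGroup.counter("modelsServed")
  }

  override def flatMap(record: Record, out: Collector[Scored]): Unit = {
    served.inc()
    out.collect(Scored(record.dataType, 0.0)) // actual scoring omitted
  }
}

Keeping the same counts in (queryable) state would then be an optional add-on for fault tolerance rather than the primary monitoring path.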
On Wed, Jan 31, 2018 at 1:07 AM, Fabian Hueske <fhue...@gmail.com> wrote:

> OK, I think there was plenty of time to comment on this FLIP.
> I'll move it to the ACCEPTED status.
>
> @Stavros, please consider the feedback regarding the metrics.
> I agree with Chesnay that metrics should be primarily exposed via the
> metrics system.
> Storing them in state makes them fault-tolerant and queryable if the state
> is properly configured.
>
> Thanks,
> Fabian
>
> 2018-01-22 17:19 GMT+01:00 Chesnay Schepler <ches...@apache.org>:
>
> > I'm currently looking over it, but one thing that stood out was that the
> > FLIP proposes to use queryable state as a monitoring solution. Given
> > that we have a metric system that integrates with plenty of commonly
> > used metric backends, this doesn't really make sense to me.
> >
> > Storing them in state still has value in terms of fault tolerance
> > though, since this is something that the metric system doesn't provide
> > by itself.
> >
> > On 18.01.2018 13:57, Fabian Hueske wrote:
> >
> >> Are there any more comments on the FLIP?
> >>
> >> Otherwise, I'd suggest moving the FLIP to the accepted FLIPs [1] and
> >> continuing with the implementation.
> >>
> >> Also, is there a committer who'd like to shepherd the FLIP and review
> >> the corresponding PRs?
> >> Of course, everybody is welcome to review the code, but we need at
> >> least one committer who will eventually merge the changes.
> >>
> >> Best,
> >> Fabian
> >>
> >> [1] https://cwiki.apache.org/confluence/display/FLINK/Flink+Improvement+Proposals
> >>
> >> 2017-12-04 10:54 GMT+01:00 Fabian Hueske <fhue...@gmail.com>:
> >>
> >>> Hi,
> >>>
> >>> Sorry for the late follow-up.
> >>>
> >>> I think I understand the motivation for choosing ProtoBuf as the
> >>> representation and serialization format, and this makes sense to me.
> >>>
> >>> However, it might be a good idea to provide tooling to convert Flink
> >>> types (described as TypeInformation) to ProtoBuf.
> >>> Otherwise, users of the model serving library would need to manually
> >>> convert their data types (say Scala tuples, case classes, or Avro
> >>> Pojos) to ProtoBuf messages.
> >>> I don't think that this needs to be included in the first version, but
> >>> it might be a good extension to make the library easier to use.
> >>>
> >>> Best,
> >>> Fabian
> >>>
> >>> 2017-11-28 17:22 GMT+01:00 Boris Lublinsky <boris.lublin...@lightbend.com>:
> >>>
> >>>> Thanks Fabian,
> >>>> More below.
> >>>>
> >>>> Boris Lublinsky
> >>>> FDP Architect
> >>>> boris.lublin...@lightbend.com
> >>>> https://www.lightbend.com/
> >>>>
> >>>> On Nov 28, 2017, at 8:21 AM, Fabian Hueske <fhue...@gmail.com> wrote:
> >>>>
> >>>> Hi Boris and Stavros,
> >>>>
> >>>> Thanks for the responses.
> >>>>
> >>>> Ad 1) Thanks for the clarification. I think I misunderstood this part
> >>>> of the proposal.
> >>>> I interpreted the argument for choosing ProtoBuf for network encoding
> >>>> ("ability to represent different data types") to mean that a model
> >>>> pipeline should work on different data types.
> >>>> I agree that it should be possible to give records of the same type
> >>>> (but with different keys) to different models. The key-based join
> >>>> approach looks good to me.
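The key-based join referred to above could look roughly like the following sketch: both streams are keyed by a data-type key and connected, and the current model for each key is held in keyed state. The type names (DataRecord, ModelToServe, ServingResult) and the operator are assumptions for illustration; the FLIP's actual interfaces may differ.

import org.apache.flink.api.common.state.{ValueState, ValueStateDescriptor}
import org.apache.flink.configuration.Configuration
import org.apache.flink.streaming.api.functions.co.RichCoFlatMapFunction
import org.apache.flink.util.Collector

// Illustrative wire-level types: an opaque payload plus a data-type
// header that serves as the join key.
case class DataRecord(dataType: String, payload: Array[Byte])
case class ModelToServe(dataType: String, modelBytes: Array[Byte])
case class ServingResult(dataType: String, result: Double)

class ModelServingJoin
    extends RichCoFlatMapFunction[DataRecord, ModelToServe, ServingResult] {

  // Current model per key; keyed state makes it fault-tolerant and,
  // if configured, queryable.
  @transient private var modelState: ValueState[ModelToServe] = _

  override def open(parameters: Configuration): Unit = {
    modelState = getRuntimeContext.getState(
      new ValueStateDescriptor[ModelToServe]("current-model", classOf[ModelToServe]))
  }

  override def flatMap1(data: DataRecord, out: Collector[ServingResult]): Unit = {
    val model = modelState.value()
    if (model != null) {
      // The model implementation unmarshals the payload itself.
      out.collect(ServingResult(data.dataType, 0.0)) // scoring omitted
    }
  }

  override def flatMap2(model: ModelToServe, out: Collector[ServingResult]): Unit = {
    modelState.update(model) // hot-swap the model for this key
  }
}

Wired up with the Scala DataStream API, the low-level join is then:
data.connect(models).keyBy(_.dataType, _.dataType).flatMap(new ModelServingJoin)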
> >>>> Ad 2) I understand that ProtoBuf is a good choice to serialize models
> >>>> for the given reasons.
> >>>> However, the choice of ProtoBuf serialization for the records might
> >>>> make the integration with existing libraries and also regular
> >>>> DataStream programs more difficult.
> >>>> They all use Flink's TypeSerializer system to serialize and
> >>>> deserialize records by default. Hence, we would need to add a
> >>>> conversion step before records can be passed to a model serving
> >>>> operator.
> >>>> Are you expecting some common format that all records follow (such as
> >>>> a Row or Vector type) or do you plan to support arbitrary records
> >>>> such as Pojos?
> >>>> If you plan for a specific type, you could add a TypeInformation for
> >>>> this type with a TypeSerializer that is based on ProtoBuf.
> >>>>
> >>>> The way I look at it is slightly different. The common format for
> >>>> records, supported by Flink, is a byte array with a small header that
> >>>> describes the data type and is used for routing. The actual
> >>>> unmarshalling is done by the model implementation itself. This
> >>>> provides maximum flexibility and gives users the freedom to create
> >>>> their own types without breaking the underlying framework.
> >>>>
> >>>> Ad 4) @Boris: I made this point not about the serialization format
> >>>> but about how the library would integrate with Flink's DataStream
> >>>> API.
> >>>> I thought I had seen a code snippet that showed a new method on the
> >>>> DataStream object but cannot find it anymore.
> >>>> So, I just wanted to make the point that we should not change the
> >>>> DataStream API (unless it lacks support for some features) and build
> >>>> the model serving library on top of it.
> >>>> But I get from Stavros' answer that this is your design anyway.
> >>>>
> >>>> Ad 5) The metrics system is the default way to expose system and job
> >>>> metrics in Flink. Due to the pluggable reporter interface and various
> >>>> reporters, they can be easily integrated in many production
> >>>> environments.
> >>>> A solution based on queryable state will always need custom code to
> >>>> access the information. Of course, this can be an optional feature.
> >>>>
> >>>> What do others think about this proposal?
> >>>>
> >>>> We had agreement among the working group (Eron, Bas, Andrea, etc.),
> >>>> but you are the first one outside of it. My book
> >>>> https://www.lightbend.com/blog/serving-machine-learning-models-free-oreilly-ebook-from-lightbend
> >>>> has reasonably good reviews, so we are hoping this will work.
> >>>>
> >>>> Best, Fabian
> >>>>
> >>>> 2017-11-28 13:53 GMT+01:00 Stavros Kontopoulos <st.kontopou...@gmail.com>:
> >>>>
> >>>>> Hi Fabian, thanks!
> >>>>>
> >>>>>> 1) Is it a strict requirement that a ML pipeline must be able to
> >>>>>> handle different input types?
> >>>>>> I understand that it makes sense to have different models for
> >>>>>> different instances of the same type, i.e., same data type but
> >>>>>> different keys. Hence, the key-based joins make sense to me.
> >>>>>> However, couldn't completely different types be handled by
> >>>>>> different ML pipelines or would there be major drawbacks?
> >>>>>
> >>>>> Could you elaborate more on this? Right now we only use keys when we
> >>>>> do the join. A given pipeline can handle only a well-defined type
> >>>>> (the type can be a simple string with a custom value, no need to be
> >>>>> a class type), which serves as a key.
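The byte-array-with-header contract described above could be sketched as follows (all names are illustrative): the framework reads only the small header for routing, while each model implementation unmarshals its own payload, e.g. with a generated ProtoBuf parser.

// Illustrative sketch of the wire contract: the framework routes on the
// header and never interprets the payload.
case class WireRecord(dataType: String, payload: Array[Byte])

// Each model implementation owns the unmarshalling of its payload, so
// users can add new data types without touching the framework.
trait ServableModel {
  def score(payload: Array[Byte]): Double
}

class WineQualityModel extends ServableModel {
  override def score(payload: Array[Byte]): Double = {
    // e.g. val input = WineRecord.parseFrom(payload)
    // (WineRecord is an assumed generated ProtoBuf message class)
    0.0 // scoring omitted
  }
}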
> >>>>>
> >>>>> 2)
> >>>>>
> >>>>>> I think from an API point of view it would be better to not require
> >>>>>> input records to be encoded as ProtoBuf messages. Instead, the
> >>>>>> model server could accept strongly-typed objects (Java/Scala) and
> >>>>>> (if necessary) convert them to ProtoBuf messages internally. In
> >>>>>> case we need to support different types of records (see my first
> >>>>>> point), we can introduce a Union type (i.e., an n-ary Either type).
> >>>>>> I see that we need some kind of binary encoding format for the
> >>>>>> models, but maybe also this can be designed to be pluggable such
> >>>>>> that later other encodings can be added.
> >>>>>
> >>>>> We do use Scala classes (strongly typed classes); ProtoBuf is only
> >>>>> used on the wire. For on-the-wire encoding we prefer ProtoBuf for
> >>>>> size, expressiveness, and the ability to represent different data
> >>>>> types.
> >>>>>
> >>>>> 3)
> >>>>>
> >>>>>> I think the DataStream Java API should be supported as a
> >>>>>> first-class citizen for this library.
> >>>>>
> >>>>> I agree. It should be either first priority or the next thing to do.
> >>>>>
> >>>>> 4)
> >>>>>
> >>>>>> For the integration with the DataStream API, we could provide an
> >>>>>> API that receives (typed) DataStream objects, internally constructs
> >>>>>> the DataStream operators, and returns one (or more) result
> >>>>>> DataStreams. The benefit is that we don't need to change the
> >>>>>> DataStream API directly, but put a library on top. The other
> >>>>>> libraries (CEP, Table, Gelly) follow this approach.
> >>>>>
> >>>>> We will provide a DSL which will do just this. But even without the
> >>>>> DSL, this is what we do with low-level joins.
> >>>>>
> >>>>> 5)
> >>>>>
> >>>>>> I'm skeptical about using queryable state to expose metrics. Did
> >>>>>> you consider using Flink's metrics system [1]? It is easily
> >>>>>> configurable, and we provide several reporters that export the
> >>>>>> metrics.
> >>>>>
> >>>>> This of course is an option. The choice of queryable state was
> >>>>> mostly driven by the simplicity of real-time integration. Any reason
> >>>>> why the metrics system is better?
> >>>>>
> >>>>> Best,
> >>>>> Stavros
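A sketch of the library-on-top facade discussed in point 4), in the style of CEP/Table/Gelly and reusing the illustrative types and ModelServingJoin from the join sketch above; the names are assumptions, not the FLIP's final API.

import org.apache.flink.streaming.api.scala._

// Hypothetical facade: it takes typed DataStreams and returns a result
// DataStream, leaving the DataStream API itself untouched.
object ModelServing {
  def score(
      data: DataStream[DataRecord],
      models: DataStream[ModelToServe]): DataStream[ServingResult] =
    data
      .connect(models)
      .keyBy(_.dataType, _.dataType)
      .flatMap(new ModelServingJoin) // the keyed join sketched earlier
}

// Usage: val results = ModelServing.score(dataStream, modelStream)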
> >>>>> On Mon, Nov 27, 2017 at 4:23 PM, Fabian Hueske <fhue...@gmail.com> wrote:
> >>>>>
> >>>>>> Hi Stavros,
> >>>>>>
> >>>>>> thanks for the detailed FLIP!
> >>>>>> Model serving is an important use case and it's great to see
> >>>>>> efforts to add a library for this to Flink!
> >>>>>>
> >>>>>> I've read the FLIP and would like to ask a few questions and make
> >>>>>> some suggestions.
> >>>>>>
> >>>>>> 1) Is it a strict requirement that a ML pipeline must be able to
> >>>>>> handle different input types?
> >>>>>> I understand that it makes sense to have different models for
> >>>>>> different instances of the same type, i.e., same data type but
> >>>>>> different keys. Hence, the key-based joins make sense to me.
> >>>>>> However, couldn't completely different types be handled by
> >>>>>> different ML pipelines or would there be major drawbacks?
> >>>>>>
> >>>>>> 2) I think from an API point of view it would be better to not
> >>>>>> require input records to be encoded as ProtoBuf messages. Instead,
> >>>>>> the model server could accept strongly-typed objects (Java/Scala)
> >>>>>> and (if necessary) convert them to ProtoBuf messages internally. In
> >>>>>> case we need to support different types of records (see my first
> >>>>>> point), we can introduce a Union type (i.e., an n-ary Either type).
> >>>>>> I see that we need some kind of binary encoding format for the
> >>>>>> models, but maybe also this can be designed to be pluggable such
> >>>>>> that later other encodings can be added.
> >>>>>>
> >>>>>> 3) I think the DataStream Java API should be supported as a
> >>>>>> first-class citizen for this library.
> >>>>>>
> >>>>>> 4) For the integration with the DataStream API, we could provide an
> >>>>>> API that receives (typed) DataStream objects, internally constructs
> >>>>>> the DataStream operators, and returns one (or more) result
> >>>>>> DataStreams. The benefit is that we don't need to change the
> >>>>>> DataStream API directly, but put a library on top. The other
> >>>>>> libraries (CEP, Table, Gelly) follow this approach.
> >>>>>>
> >>>>>> 5) I'm skeptical about using queryable state to expose metrics. Did
> >>>>>> you consider using Flink's metrics system [1]? It is easily
> >>>>>> configurable, and we provide several reporters that export the
> >>>>>> metrics.
> >>>>>>
> >>>>>> What do you think?
> >>>>>> Best, Fabian
> >>>>>>
> >>>>>> [1]
> >>>>>> https://ci.apache.org/projects/flink/flink-docs-release-1.3/monitoring/metrics.html
> >>>>>>
> >>>>>> 2017-11-23 12:32 GMT+01:00 Stavros Kontopoulos <st.kontopou...@gmail.com>:
> >>>>>>
> >>>>>>> Hi guys,
> >>>>>>>
> >>>>>>> Let's discuss the new FLIP proposal for model serving over Flink.
> >>>>>>> The idea is to combine previous efforts there and provide a
> >>>>>>> library on top of Flink for serving models.
> >>>>>>>
> >>>>>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-23+-+Model+Serving
> >>>>>>>
> >>>>>>> Code from previous efforts can be found here:
> >>>>>>> https://github.com/FlinkML
> >>>>>>>
> >>>>>>> Best,
> >>>>>>> Stavros
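For contrast with the metrics system discussed in point 5), this is roughly the custom client code that the queryable-state route requires from every consumer. Host, port, job id, key, and the queryable name "model-state" are placeholders, and the sketch assumes the job registered the keyed model state (as in the join sketch above) as queryable.

import org.apache.flink.api.common.JobID
import org.apache.flink.api.common.state.ValueStateDescriptor
import org.apache.flink.api.common.typeinfo.BasicTypeInfo
import org.apache.flink.queryablestate.client.QueryableStateClient

object ServingStateQuery {
  def main(args: Array[String]): Unit = {
    // Talks to the queryable state proxy on a TaskManager (default port 9069).
    val client = new QueryableStateClient("taskmanager-host", 9069)
    val jobId = JobID.fromHexString(args(0)) // the running job's id

    // Must match the descriptor the job registered as queryable.
    val descriptor = new ValueStateDescriptor[ModelToServe](
      "current-model", classOf[ModelToServe])

    // Fetch the state for key "wine" under the queryable name "model-state".
    val future = client.getKvState(
      jobId, "model-state", "wine", BasicTypeInfo.STRING_TYPE_INFO, descriptor)

    println(future.get().value())
    client.shutdownAndWait()
  }
}

A metrics reporter, by comparison, is a one-time configuration change rather than per-consumer code, which is the crux of the objection in the thread.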