Re: Serving Machine Learning models

2022-01-11 Thread Xingbo Huang
Hi Sonia,

As far as I know, PyFlink users prefer to use a Python UDF [1][2] for model
prediction: load the model when the UDF is initialized, and then run
prediction on each new piece of data, e.g. as in the sketch below.

[1]
https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/python/table/udfs/overview/
[2]
https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/python/datastream/operators/process_function/

Best,
Xingbo



Re: Serving Machine Learning models

2022-01-10 Thread David Anderson
Another approach that I find quite natural is to use Flink's Stateful
Functions API [1] for model serving. This has some nice advantages, such as
zero-downtime deployments of new models and the ease with which you can use
Python; [2] is an example of this approach.

[1] https://flink.apache.org/stateful-functions.html
[2] https://github.com/ververica/flink-statefun-workshop
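
For illustration, here is a minimal sketch of such a remote Python function
written with the StateFun Python SDK, in the spirit of [2] (the typename,
payload format, and model loading are my own placeholders, not taken from
the workshop):

import pickle

from aiohttp import web
from statefun import StatefulFunctions, RequestReplyHandler

functions = StatefulFunctions()

# The model lives in this separately deployed function process, so rolling
# out a new model is just a redeploy of this service; the Flink job keeps running.
with open("/path/to/model.pkl", "rb") as f:
    model = pickle.load(f)


@functions.bind(typename="example/predict")
async def predict(context, message):
    # Assume the caller sends the feature vector as a comma-separated string.
    features = [float(x) for x in message.as_string().split(",")]
    score = float(model.predict([features])[0])
    # A real function would forward the score, e.g. via context.send_egress(...).
    print(f"score: {score}")


handler = RequestReplyHandler(functions)


async def handle(request):
    # The StateFun runtime (inside Flink) calls this endpoint for each batch of messages.
    response_data = await handler.handle_async(await request.read())
    return web.Response(body=response_data, content_type="application/octet-stream")


app = web.Application()
app.add_routes([web.post("/statefun", handle)])
# web.run_app(app, port=8000)

Because the model lives in its own process, it can be scaled and redeployed
independently of the Flink cluster, which is where the zero-downtime model
updates come from.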



Re: Serving Machine Learning models

2022-01-07 Thread Yun Gao
Hi Sonia,

Sorry, I don't have statistics on the two methods you mentioned, but as
additional input I can offer a third option: there is an ecosystem project,
dl-on-flink [1], that supports running DL frameworks on top of Flink. It
handles the data exchange between the Java and Python processes, which
allows you to use the native model directly.

Best,
Yun


[1] https://github.com/flink-extended/dl-on-flink




--
From:Sonia-Florina Horchidan 
Send Time:2022 Jan. 7 (Fri.) 17:23
To:user@flink.apache.org 
Subject:Serving Machine Learning models



Hello,

I recently started looking into serving Machine Learning models for streaming 
data in Flink. To give more context, that would involve training a model 
offline (using PyTorch or TensorFlow), and calling it from inside a Flink job 
to do online inference on newly arrived data. I have found multiple 
discussions, presentations, and tools that could achieve this, and it seems 
like the two alternatives would be: (1) wrap the pre-trained models in a HTTP 
service (such as PyTorch Serve [1]) and let Flink do async calls for model 
scoring, or (2) convert the models into a standardized format (e.g., ONNX [2]), 
pre-load the model in memory for every task manager (or use external storage if 
needed) and call it for each new data point. 
Both approaches come with a set of advantages and drawbacks and, as far as I 
understand, there is no "silver bullet", since one approach could be more 
suitable than the other based on the application requirements. However, I would 
be curious to know what would be the "recommended" methods for model serving 
(if any) and what approaches are currently adopted by the users in the wild.
[1] https://pytorch.org/serve/
[2] https://onnx.ai/
Best regards,
Sonia

 [Kth Logo]

Sonia-Florina Horchidan
PhD Student
KTH Royal Institute of Technology
Software and Computer Systems (SCS)
School of Electrical Engineering and Computer Science (EECS)
Mobil: +46769751562
sf...@kth.se, www.kth.se



Serving Machine Learning models

2022-01-07 Thread Sonia-Florina Horchidan
Hello,


I recently started looking into serving Machine Learning models for streaming 
data in Flink. To give more context, that would involve training a model 
offline (using PyTorch or TensorFlow), and calling it from inside a Flink job 
to do online inference on newly arrived data. I have found multiple 
discussions, presentations, and tools that could achieve this, and it seems 
like the two alternatives would be: (1) wrap the pre-trained models in an HTTP
service (such as PyTorch Serve [1]) and let Flink do async calls for model 
scoring, or (2) convert the models into a standardized format (e.g., ONNX [2]), 
pre-load the model in memory for every task manager (or use external storage if 
needed) and call it for each new data point.
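
For concreteness, option (2) is roughly what I picture below (just a sketch
using onnxruntime with the PyFlink DataStream API; the model path and input
layout are made up):

import numpy as np
import onnxruntime as ort
from pyflink.datastream.functions import MapFunction, RuntimeContext


class OnnxScorer(MapFunction):

    def open(self, runtime_context: RuntimeContext):
        # Load the exported ONNX model once per parallel task instance.
        self.session = ort.InferenceSession("/path/to/model.onnx")
        self.input_name = self.session.get_inputs()[0].name

    def map(self, features):
        # Score one record; 'features' is assumed to be a list of floats.
        batch = np.asarray([features], dtype=np.float32)
        outputs = self.session.run(None, {self.input_name: batch})
        return float(outputs[0][0])

# Applied as, e.g.: scores = stream.map(OnnxScorer(), output_type=Types.FLOAT())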

Both approaches come with a set of advantages and drawbacks and, as far as I 
understand, there is no "silver bullet", since one approach could be more 
suitable than the other based on the application requirements. However, I would 
be curious to know what would be the "recommended" methods for model serving 
(if any) and what approaches are currently adopted by the users in the wild.


[1] https://pytorch.org/serve/

[2] https://onnx.ai/


Best regards,

Sonia



Sonia-Florina Horchidan
PhD Student
KTH Royal Institute of Technology
Software and Computer Systems (SCS)
School of Electrical Engineering and Computer Science (EECS)
Mobile: +46769751562
sf...@kth.se, www.kth.se