Re: Serving Machine Learning models
Hi Sonia,

As far as I know, PyFlink users prefer to use a Python UDF [1][2] for model prediction: load the model when the UDF is initialized, and then run prediction on each new piece of data as it arrives.

[1] https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/python/table/udfs/overview/
[2] https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/python/datastream/operators/process_function/

Best,
Xingbo
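For concreteness, here is a minimal sketch of the pattern Xingbo describes, written against the PyFlink 1.14 Table API. The model path, the two feature columns, and the choice of ONNX Runtime for in-process scoring are assumptions for illustration, not part of the original advice:

    import numpy as np
    import onnxruntime as ort
    from pyflink.table import DataTypes
    from pyflink.table.udf import ScalarFunction, udf


    class Predict(ScalarFunction):
        """Scores one row at a time with a model loaded once per parallel instance."""

        def open(self, function_context):
            # Hypothetical model path: ship the .onnx file with the job artifacts.
            self._session = ort.InferenceSession("/opt/models/model.onnx")
            self._input_name = self._session.get_inputs()[0].name

        def eval(self, f1, f2):
            features = np.array([[f1, f2]], dtype=np.float32)
            # run() returns one array per declared model output; take the first.
            outputs = self._session.run(None, {self._input_name: features})
            return float(outputs[0][0])


    predict = udf(Predict(), result_type=DataTypes.DOUBLE())

    # Usage on a Table `t` with numeric columns f1 and f2:
    #   t.select(predict(col("f1"), col("f2")).alias("score"))

The key point is that open() runs once per parallel UDF instance, so the model is loaded once rather than once per record.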
Re: Serving Machine Learning models
Another approach that I find quite natural is to use Flink's Stateful Functions API [1] for model serving. This has some nice advantages, such as zero-downtime deployments of new models and the ease with which you can use Python. [2] is an example of this approach.

[1] https://flink.apache.org/stateful-functions.html
[2] https://github.com/ververica/flink-statefun-workshop
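As a rough illustration of what this looks like with the StateFun Python SDK, the sketch below binds a function that scores incoming records with a model loaded at module import time, so rolling out a new container image rolls out a new model. The typenames, payload format, and Kafka egress are all hypothetical; the workshop code in [2] is the authoritative example:

    import pickle

    from aiohttp import web
    from statefun import (Context, Message, RequestReplyHandler,
                          StatefulFunctions, kafka_egress_message)

    functions = StatefulFunctions()

    # Hypothetical: the model ships inside the function's container image,
    # so deploying a new image swaps the model with zero downtime.
    with open("/opt/models/model.pkl", "rb") as f:
        MODEL = pickle.load(f)


    @functions.bind(typename="example/predictor")
    async def predictor(context: Context, message: Message):
        # Assume the caller sends features as a comma-separated string.
        features = [float(x) for x in message.as_string().split(",")]
        score = float(MODEL.predict([features])[0])
        context.send_egress(
            kafka_egress_message(typename="example/scores", topic="scores",
                                 key=context.address.id, value=str(score)))


    # Expose the function to the StateFun runtime over HTTP.
    handler = RequestReplyHandler(functions)


    async def handle(request):
        response_bytes = await handler.handle_async(await request.read())
        return web.Response(body=response_bytes,
                            content_type="application/octet-stream")


    app = web.Application()
    app.add_routes([web.post("/statefun", handle)])

    if __name__ == "__main__":
        web.run_app(app, port=8000)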
Re: Serving Machine Learning models
Hi Sonia,

Sorry, I don't have statistics on the two methods you described, but perhaps as additional input I can offer another method: there is an ecosystem project, dl-on-flink [1], that supports running DL frameworks on top of Flink. It handles the data exchange between the Java and Python processes, which allows you to use the native model directly.

Best,
Yun

[1] https://github.com/flink-extended/dl-on-flink
Serving Machine Learning models
Hello,

I recently started looking into serving Machine Learning models for streaming data in Flink. To give more context, that would involve training a model offline (using PyTorch or TensorFlow) and calling it from inside a Flink job to do online inference on newly arrived data. I have found multiple discussions, presentations, and tools that could achieve this, and it seems like the two alternatives are: (1) wrap the pre-trained models in an HTTP service (such as PyTorch Serve [1]) and let Flink make async calls for model scoring, or (2) convert the models into a standardized format (e.g., ONNX [2]), pre-load the model in memory on every task manager (or use external storage if needed), and call it for each new data point.

Both approaches come with a set of advantages and drawbacks and, as far as I understand, there is no "silver bullet", since one approach may be more suitable than the other depending on the application requirements. However, I would be curious to know what the "recommended" methods for model serving are (if any) and what approaches are currently adopted by users in the wild.

[1] https://pytorch.org/serve/
[2] https://onnx.ai/

Best regards,
Sonia

Sonia-Florina Horchidan
PhD Student
KTH Royal Institute of Technology
Software and Computer Systems (SCS)
School of Electrical Engineering and Computer Science (EECS)
Mobile: +46769751562
sf...@kth.se, www.kth.se
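For option (1), the call shape from inside a PyFlink job might look like the sketch below, assuming a TorchServe-style endpoint at a hypothetical host and model name. Note that PyFlink has no async I/O operator, so this call is synchronous; real deployments would typically use Flink's AsyncFunction from the Java/Scala DataStream API to avoid blocking on each record:

    import requests
    from pyflink.table import DataTypes
    from pyflink.table.udf import ScalarFunction, udf


    class RemotePredict(ScalarFunction):
        """Option (1): delegate scoring to an external model server over HTTP."""

        def open(self, function_context):
            # Hypothetical TorchServe endpoint; reuse one HTTP session
            # per parallel UDF instance to avoid reconnecting per record.
            self._session = requests.Session()
            self._url = "http://model-server:8080/predictions/my_model"

        def eval(self, features_json):
            resp = self._session.post(self._url, data=features_json, timeout=1.0)
            resp.raise_for_status()
            # Assumes the model handler returns a bare numeric score;
            # the actual response format depends on the serving handler.
            return float(resp.json())


    remote_predict = udf(RemotePredict(), result_type=DataTypes.DOUBLE())

A sketch of option (2), loading an ONNX model in the UDF's open() method and scoring in-process, appears under Xingbo's reply above.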