Hi Sonia,

As far as I know, PyFlink users prefer to use a Python UDF [1][2] for model prediction: load the model when the UDF is initialized, and then score each new piece of data as it arrives.
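For example, a minimal sketch of that pattern with the Table API could look like the following (it assumes a pickled scikit-learn-style model stored as "model.pkl" and a single DOUBLE feature column; the file name and model API are placeholders, adjust them to your setup):

from pyflink.table import DataTypes
from pyflink.table.udf import ScalarFunction, udf

class Predict(ScalarFunction):

    def open(self, function_context):
        # Load the model once, when the UDF is initialized on the worker.
        import pickle
        with open("model.pkl", "rb") as f:
            self._model = pickle.load(f)

    def eval(self, feature):
        # Score one record with the pre-loaded model.
        return float(self._model.predict([[feature]])[0])

predict = udf(Predict(), result_type=DataTypes.DOUBLE())
# t_env.create_temporary_function("predict", predict)
# t_env.sql_query("SELECT predict(feature) FROM input_table")

The same idea works with the DataStream API from [2]: load the model in the open() method of a process function and call it for each element in process_element().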
[1] https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/python/table/udfs/overview/
[2] https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/python/datastream/operators/process_function/

Best,
Xingbo

David Anderson <[email protected]> wrote on Tue, Jan 11, 2022 at 03:39:

> Another approach that I find quite natural is to use Flink's Stateful
> Functions API [1] for model serving, and this has some nice advantages,
> such as zero-downtime deployments of new models and the ease with which
> you can use Python. [2] is an example of this approach.
>
> [1] https://flink.apache.org/stateful-functions.html
> [2] https://github.com/ververica/flink-statefun-workshop
>
> On Fri, Jan 7, 2022 at 5:55 PM Yun Gao <[email protected]> wrote:
>
>> Hi Sonia,
>>
>> I'm afraid I don't have statistics on the two methods you mentioned, but
>> perhaps I could offer another option: there is an eco-project,
>> dl-on-flink [1], that supports running DL frameworks on top of Flink. It
>> handles the data exchange between the Java and Python processes, which
>> allows using the native model directly.
>>
>> Best,
>> Yun
>>
>> [1] https://github.com/flink-extended/dl-on-flink
>>
>> ------------------------------------------------------------------
>> From: Sonia-Florina Horchidan <[email protected]>
>> Send Time: 2022 Jan. 7 (Fri.) 17:23
>> To: [email protected] <[email protected]>
>> Subject: Serving Machine Learning models
>>
>> Hello,
>>
>> I recently started looking into serving Machine Learning models for
>> streaming data in Flink. To give more context, that would involve training
>> a model offline (using PyTorch or TensorFlow) and calling it from inside a
>> Flink job to do online inference on newly arrived data. I have found
>> multiple discussions, presentations, and tools that could achieve this, and
>> it seems like the two alternatives would be: (1) wrap the pre-trained
>> models in an HTTP service (such as PyTorch Serve [1]) and let Flink do
>> async calls for model scoring, or (2) convert the models into a
>> standardized format (e.g., ONNX [2]), pre-load the model in memory for
>> every task manager (or use external storage if needed), and call it for
>> each new data point.
>>
>> Both approaches come with a set of advantages and drawbacks and, as far
>> as I understand, there is no "silver bullet", since one approach could be
>> more suitable than the other based on the application requirements.
>> However, I would be curious to know what the "recommended" methods for
>> model serving are (if any) and what approaches are currently adopted by
>> users in the wild.
>>
>> [1] https://pytorch.org/serve/
>>
>> [2] https://onnx.ai/
>>
>> Best regards,
>>
>> Sonia
>>
>> Sonia-Florina Horchidan
>> PhD Student
>> KTH Royal Institute of Technology
>> Software and Computer Systems (SCS)
>> School of Electrical Engineering and Computer Science (EECS)
>> Mobile: +46769751562
>> [email protected], www.kth.se
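As for approach (2) in the quoted message, a minimal sketch with the PyFlink DataStream API might look like the following (it assumes a model exported to "model.onnx" with a single float32 input vector and onnxruntime installed on the task managers; the file name and feature shape are placeholders):

import numpy as np
from pyflink.common import Types
from pyflink.datastream import StreamExecutionEnvironment
from pyflink.datastream.functions import MapFunction, RuntimeContext

class OnnxPredict(MapFunction):

    def open(self, runtime_context: RuntimeContext):
        # Load the ONNX model once per parallel task instance.
        import onnxruntime as ort
        self._session = ort.InferenceSession("model.onnx")
        self._input_name = self._session.get_inputs()[0].name

    def map(self, features):
        # Score one record; run() returns a list of output arrays.
        batch = np.asarray([features], dtype=np.float32)
        return float(self._session.run(None, {self._input_name: batch})[0][0])

env = StreamExecutionEnvironment.get_execution_environment()
env.from_collection([[0.1, 0.2, 0.3]]) \
   .map(OnnxPredict(), output_type=Types.FLOAT()) \
   .print()
env.execute("onnx-scoring-sketch")

Loading the model in open() means each parallel task instance keeps one copy in memory, so every new record only pays the cost of a local inference call.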
