Thanks Matyas and Shengkai.
> 1. I'm wondering whether we could extend the SQL API to change how Python
> models are loaded. For example, we could allow users to write:
>
> ```
> CREATE MODEL my_pytorch_model
> WITH (
>   'type' = 'pytorch'
> ) LANGUAGE PYTHON;
> ```

I had thought about this, but one initial concern I have with this model is:
would a model provider be implemented completely in Python itself? When we
look at PyTorch, HuggingFace, or ONNX model providers, for example, do we need
provider-specific behavior or optimizations around PredictRuntimeContext /
model config building, batching, or resource scheduling decisions, which would
need to be made in the Java Flink entry points?

> 2. Beam already supports TensorFlow, ONNX, and many built-in models. Can we
> reuse Beam's utilities to build Flink prediction functions[1]?

Thanks, I will take a look at this to understand how it works and whether we
can learn from that design.

> 3. It would be better if we introduced a PredictRuntimeContext to help
> users download required weight files.

Currently I have this as model config (set_model_config), which is passed to
the PredictFunction in Python, but a PredictRuntimeContext seems more
suitable. I will look into passing a PredictRuntimeContext in open(), similar
to the RuntimeContext for UDFs.
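To make that a bit more concrete, here is a rough sketch of how the Python
side could look if open() received such a context. This is only an
illustration for the discussion: apart from PredictFunction and
set_model_config, which the proposal already mentions, the names here
(get_model_config, get_local_resource, the 'weights.uri' option,
MyTorchPredictFunction) are hypothetical and not part of the current draft.

```
# Hypothetical sketch only: names and signatures are not part of the FLIP yet.
from abc import ABC, abstractmethod


class PredictRuntimeContext(ABC):
    """Runtime handle passed to open(), analogous to RuntimeContext for UDFs."""

    @abstractmethod
    def get_model_config(self) -> dict:
        """Options from the CREATE MODEL ... WITH (...) clause."""

    @abstractmethod
    def get_local_resource(self, remote_uri: str) -> str:
        """Fetch a weight/artifact file to the worker and return its local path."""


class PredictFunction(ABC):
    """Base class for Python predict functions, as discussed in the FLIP."""

    def open(self, context: PredictRuntimeContext) -> None:
        """Called once per task before any inference; load the model here."""

    @abstractmethod
    def predict(self, row):
        """Per-record inference; a batch-oriented variant is sketched below."""

    def close(self) -> None:
        """Release model resources."""


class MyTorchPredictFunction(PredictFunction):
    """Example user function; assumes torch and a TorchScript model file."""

    def open(self, context: PredictRuntimeContext) -> None:
        import torch

        weights_uri = context.get_model_config()["weights.uri"]  # hypothetical option
        self._model = torch.jit.load(context.get_local_resource(weights_uri),
                                     map_location="cpu")
        self._model.eval()

    def predict(self, row):
        import torch

        with torch.no_grad():
            return self._model(torch.tensor([row], dtype=torch.float32)).tolist()[0]
```

The main point is that artifact resolution (download, caching, cleanup) would
sit behind the context instead of being re-implemented in every user function.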
> 4. In ML, users typically perform inference on batches of data. Therefore,
> per-record evaluation may not be necessary. How about we just introduce API
> like[2]?

Yes, I completely agree on this. I was first aiming for agreement on the model
creation and provider API, and will then look at this in more detail.
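For the batch direction, a minimal sketch of what a bundled entry point could
look like is below, assuming a hypothetical predict_batch hook; none of these
names are settled, and FLIP-491 is only the SQL-side precedent.

```
# Illustrative sketch of a batch-oriented variant; predict_batch and the
# surrounding names are hypothetical.
import numpy as np


class BatchPredictFunction:
    def open(self, context) -> None:
        # `context` would be the PredictRuntimeContext sketched above; a real
        # open() would load weights via something like context.get_local_resource().
        self._model = lambda batch: batch.mean(axis=1)  # toy stand-in for a model

    def predict_batch(self, rows):
        # Stack the bundled rows once and score them with a single vectorized
        # call instead of invoking the model per record.
        features = np.asarray(rows, dtype=np.float32)
        return self._model(features).tolist()


# fn = BatchPredictFunction(); fn.open(None)
# fn.predict_batch([[1.0, 2.0], [3.0, 4.0]]) -> [1.5, 3.5]
```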
On Tue, Oct 14, 2025 at 10:39 AM Őrhidi Mátyás <[email protected]> wrote:

> Hey Shengkai,
>
> Thank you for your observations. This proposal is mostly driven by Swapna, but I could also share my thoughts here, please find them inline.
>
> Cheers,
> Matyas
>
> On Tue, Oct 14, 2025 at 3:02 AM Shengkai Fang <[email protected]> wrote:
> >
> > Hi, Matyas.
> >
> > Thanks for the proposal. I have some suggestions about the proposal.
> >
> > 1. I'm wondering whether we could extend the SQL API to change how Python models are loaded. For example, we could allow users to write:
> >
> > ```
> > CREATE MODEL my_pytorch_model
> > WITH (
> >   'type' = 'pytorch'
> > ) LANGUAGE PYTHON;
> > ```
> >
> > In this case, we wouldn't rely on Java SPI to load the Python model provider. However, I'm not sure whether Python has a similar mechanism to SPI that avoids hardcoding class paths.
>
> This is an interesting idea, however we are proposing using the provider model because it aligns with Flink's existing Java-based architecture for discovering plugins. A Java entry point is required to launch the Python code, and this is the standard way to do it.
>
> > 2. Beam already supports TensorFlow, ONNX, and many built-in models. Can we reuse Beam's utilities to build Flink prediction functions[1]?
>
> We can certainly learn from Beam's design, but directly reusing it would add a very heavy dependency and be difficult to integrate cleanly into Flink's native processing model.
>
> > 3. It would be better if we introduced a PredictRuntimeContext to help users download required weight files.
>
> This is actually a great idea and essential for usability. Just to double check on your suggestion, your proposal is to have an explicit PredictRuntimeContext for dynamic model file downloading?
>
> > 4. In ML, users typically perform inference on batches of data. Therefore, per-record evaluation may not be necessary. How about we just introduce API like[2]?
>
> I agree completely. The row-by-row API is just a starting point, and we should aim to prioritize support for efficient batch inference to ensure good performance for real-world models.
>
> > Best,
> > Shengkai
> >
> > [1] https://beam.apache.org/documentation/ml/about-ml/
> > [2] https://cwiki.apache.org/confluence/display/FLINK/FLIP-491%3A+BundledAggregateFunction+for+batched+aggregation
> >
> > Swapna Marru <[email protected]> wrote on Tue, Oct 14, 2025 at 11:53:
> >
> > > Thanks Matyas.
> > >
> > > Hao,
> > >
> > > The proposal is to provide a generic framework. The interfaces PythonPredictRuntimeProvider / PythonPredictFunction / PredictFunction (in Python) are defined to provide a base for that framework.
> > >
> > > generic-python is one of the implementations, registered similarly to openai in the original FLIP. It is not a concrete end-to-end implementation, though. It can be used:
> > > 1. As a reference implementation for other complete, end-to-end concrete model provider implementations.
> > > 2. For simple Python model implementations, it can be used out of the box to avoid a boilerplate Java provider implementation.
> > >
> > > I will also open a PR with the current implementation changes, so it's clearer for further discussion.
> > >
> > > -Thanks,
> > > M.Swapna
> > >
> > > On Mon, Oct 13, 2025 at 5:04 PM Őrhidi Mátyás <[email protected]> wrote:
> > > >
> > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-552+Support+ML_PREDICT+for+Python+based+model+providers
> > > >
> > > > On Mon, Oct 13, 2025 at 4:10 PM Őrhidi Mátyás <[email protected]> wrote:
> > > > >
> > > > > Swapna, I can help you to create a FLIP page.
> > > > >
> > > > > On Mon, Oct 13, 2025 at 3:58 PM Hao Li <[email protected]> wrote:
> > > > > >
> > > > > > Hi Swapna,
> > > > > >
> > > > > > Thanks for the proposal. Can you put it in a FLIP and start a discussion thread for it?
> > > > > >
> > > > > > From an initial look, I'm a bit confused whether this is a concrete implementation for "generic-python" or a generic framework to handle Python predict functions, because everything seems concrete, like `GenericPythonModelProviderFactory` and `GenericPythonModelProvider`, except the final Python predict function.
> > > > > >
> > > > > > Also, if `GenericPythonModelProviderFactory` is predefined, do you predefine the required and optional options for it? Will it be inflexible if predefined?
> > > > > >
> > > > > > Thanks,
> > > > > > Hao
> > > > > >
> > > > > > On Mon, Oct 13, 2025 at 10:04 AM Swapna Marru <[email protected]> wrote:
> > > > > > >
> > > > > > > Hi ShengKai,
> > > > > > >
> > > > > > > Documented the initial proposal here,
> > > > > > > https://docs.google.com/document/d/1YzBxLUPvluaZIvR0S3ktc5Be1FF4bNeTsXB9ILfgyWY/edit?usp=sharing
> > > > > > >
> > > > > > > Please review and let me know your thoughts.
> > > > > > >
> > > > > > > -Thanks,
> > > > > > > Swapna
> > > > > > >
> > > > > > > On Tue, Sep 23, 2025 at 10:39 PM Shengkai Fang <[email protected]> wrote:
> > > > > > >
> > > > > > > > I see your point, and I agree that your proposal is feasible. However, there is one limitation to consider: the current loading mechanism first discovers all available factories on the classpath and then filters them based on the user-specified identifiers.
> > > > > > > >
> > > > > > > > In most practical scenarios, we would likely have only one generic factory (e.g., a GenericPythonModelFactory) present in the classpath. This means the framework would be able to load either PyTorch or TensorFlow models, whichever is defined within that single generic implementation, but not both simultaneously unless additional mechanisms are introduced.
> > > > > > > >
> > > > > > > > This doesn't block the proposal, but it's something worth noting as we design the extensibility model. We may want to explore ways to support multiple user-defined providers more seamlessly in the future.
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Shengkai
