Thanks Matyas and Shengkai.
> 1. I'm wondering whether we could extend the SQL API to change how Python
> models are loaded. For example, we could allow users to write:
>
> ```
> CREATE MODEL my_pytorch_model
> WITH (
>   'type' = 'pytorch'
> ) LANGUAGE PYTHON;
> ```

I had thought about this, but one initial concern I have with this model is:
would a model provider be implemented completely in Python itself? When we
look at PyTorch, HuggingFace, or ONNX model providers, for example, do we need
provider-specific behavior or optimizations around PredictRuntimeContext /
model config building, batching, or resource scheduling decisions, which would
need to be made in the Java Flink entry points?

> 2. Beam already supports TensorFlow, ONNX, and many built-in models. Can we
> reuse Beam's utilities to build Flink prediction functions[1]?

Thanks, I will take a look at this to understand how it works and whether we
can learn from that design.

> 3. It would be better if we introduced a PredictRuntimeContext to help
> users download required weight files.

Currently I have this as model config (set_model_config), which is passed to
the PredictFunction in Python, but a PredictRuntimeContext seems more
suitable. I will look into passing a PredictRuntimeContext in open(), similar
to the RuntimeContext for UDFs.
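To make that a bit more concrete, here is a rough sketch of how the Python
side could look if open() received such a context. This is only an
illustration for the discussion: apart from PredictFunction and
set_model_config, which the proposal already mentions, the names here
(get_model_config, get_local_resource, the 'weights.uri' option,
MyTorchPredictFunction) are hypothetical and not part of the current draft.

```
# Hypothetical sketch only: names and signatures are not part of the FLIP yet.
from abc import ABC, abstractmethod


class PredictRuntimeContext(ABC):
    """Runtime handle passed to open(), analogous to RuntimeContext for UDFs."""

    @abstractmethod
    def get_model_config(self) -> dict:
        """Options from the CREATE MODEL ... WITH (...) clause."""

    @abstractmethod
    def get_local_resource(self, remote_uri: str) -> str:
        """Fetch a weight/artifact file to the worker and return its local path."""


class PredictFunction(ABC):
    """Base class for Python predict functions, as discussed in the FLIP."""

    def open(self, context: PredictRuntimeContext) -> None:
        """Called once per task before any inference; load the model here."""

    @abstractmethod
    def predict(self, row):
        """Per-record inference; a batch-oriented variant is sketched below."""

    def close(self) -> None:
        """Release model resources."""


class MyTorchPredictFunction(PredictFunction):
    """Example user function; assumes torch and a TorchScript model file."""

    def open(self, context: PredictRuntimeContext) -> None:
        import torch

        weights_uri = context.get_model_config()["weights.uri"]  # hypothetical option
        self._model = torch.jit.load(context.get_local_resource(weights_uri),
                                     map_location="cpu")
        self._model.eval()

    def predict(self, row):
        import torch

        with torch.no_grad():
            return self._model(torch.tensor([row], dtype=torch.float32)).tolist()[0]
```

The main point is that artifact resolution (download, caching, cleanup) would
sit behind the context instead of being re-implemented in every user function.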
> 4. In ML, users typically perform inference on batches of data. Therefore,
> per-record evaluation may not be necessary. How about we just introduce API
> like[2]?

Yes, I completely agree on this. I was first aiming for agreement on the model
creation and provider API, and will then look at this in more detail.
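For the batch direction, a minimal sketch of what a bundled entry point could
look like is below, assuming a hypothetical predict_batch hook; none of these
names are settled, and FLIP-491 is only the SQL-side precedent.

```
# Illustrative sketch of a batch-oriented variant; predict_batch and the
# surrounding names are hypothetical.
import numpy as np


class BatchPredictFunction:
    def open(self, context) -> None:
        # `context` would be the PredictRuntimeContext sketched above; a real
        # open() would load weights via something like context.get_local_resource().
        self._model = lambda batch: batch.mean(axis=1)  # toy stand-in for a model

    def predict_batch(self, rows):
        # Stack the bundled rows once and score them with a single vectorized
        # call instead of invoking the model per record.
        features = np.asarray(rows, dtype=np.float32)
        return self._model(features).tolist()


# fn = BatchPredictFunction(); fn.open(None)
# fn.predict_batch([[1.0, 2.0], [3.0, 4.0]]) -> [1.5, 3.5]
```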
On Tue, Oct 14, 2025 at 10:39 AM Őrhidi Mátyás <[email protected]> wrote:

> Hey Shengkai,
>
> Thank you for your observations. This proposal is mostly driven by Swapna, but I could also share my thoughts here, please find them inline.
>
> Cheers,
> Matyas
>
> On Tue, Oct 14, 2025 at 3:02 AM Shengkai Fang <[email protected]> wrote:
> >
> > Hi, Matyas.
> >
> > Thanks for the proposal. I have some suggestions about the proposal.
> >
> > 1. I'm wondering whether we could extend the SQL API to change how Python models are loaded. For example, we could allow users to write:
> >
> > ```
> > CREATE MODEL my_pytorch_model
> > WITH (
> >   'type' = 'pytorch'
> > ) LANGUAGE PYTHON;
> > ```
> >
> > In this case, we wouldn't rely on Java SPI to load the Python model provider. However, I'm not sure whether Python has a similar mechanism to SPI that avoids hardcoding class paths.
>
> This is an interesting idea, however we are proposing using the provider model because it aligns with Flink's existing Java-based architecture for discovering plugins. A Java entry point is required to launch the Python code, and this is the standard way to do it.
>
> > 2. Beam already supports TensorFlow, ONNX, and many built-in models. Can we reuse Beam's utilities to build Flink prediction functions[1]?
>
> We can certainly learn from Beam's design, but directly reusing it would add a very heavy dependency and be difficult to integrate cleanly into Flink's native processing model.
>
> > 3. It would be better if we introduced a PredictRuntimeContext to help users download required weight files.
>
> This is actually a great idea and essential for usability. Just to double check on your suggestion, your proposal is to have an explicit PredictRuntimeContext for dynamic model file downloading?
>
> > 4. In ML, users typically perform inference on batches of data. Therefore, per-record evaluation may not be necessary. How about we just introduce API like[2]?
>
> I agree completely. The row-by-row API is just a starting point, and we should aim to prioritize support for efficient batch inference to ensure good performance for real-world models.
>
> > Best,
> > Shengkai
> >
> > [1] https://beam.apache.org/documentation/ml/about-ml/
> > [2] https://cwiki.apache.org/confluence/display/FLINK/FLIP-491%3A+BundledAggregateFunction+for+batched+aggregation
> >
> > Swapna Marru <[email protected]> wrote on Tue, Oct 14, 2025 at 11:53:
> >
> > > Thanks Matyas.
> > >
> > > Hao,
> > >
> > > The proposal is to provide a generic framework. The interfaces PythonPredictRuntimeProvider / PythonPredictFunction / PredictFunction (in Python) are defined to provide a base for that framework.
> > >
> > > generic-python is one of the implementations, registered similarly to openai in the original FLIP. It is not a concrete end-to-end implementation, though. It can be used:
> > > 1. As a reference implementation for other complete, end-to-end concrete model provider implementations.
> > > 2. For simple Python model implementations, it can be used out of the box to avoid a boilerplate Java provider implementation.
> > >
> > > I will also open a PR with the current implementation changes, so it's clearer for further discussion.
> > >
> > > -Thanks,
> > > M.Swapna
> > >
> > > On Mon, Oct 13, 2025 at 5:04 PM Őrhidi Mátyás <[email protected]> wrote:
> > > >
> > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-552+Support+ML_PREDICT+for+Python+based+model+providers
> > > >
> > > > On Mon, Oct 13, 2025 at 4:10 PM Őrhidi Mátyás <[email protected]> wrote:
> > > > >
> > > > > Swapna, I can help you to create a FLIP page.
> > > > >
> > > > > On Mon, Oct 13, 2025 at 3:58 PM Hao Li <[email protected]> wrote:
> > > > > >
> > > > > > Hi Swapna,
> > > > > >
> > > > > > Thanks for the proposal. Can you put it in a FLIP and start a discussion thread for it?
> > > > > >
> > > > > > From an initial look, I'm a bit confused whether this is a concrete implementation for "generic-python" or a generic framework to handle Python predict functions, because everything seems concrete, like `GenericPythonModelProviderFactory` and `GenericPythonModelProvider`, except the final Python predict function.
> > > > > >
> > > > > > Also, if `GenericPythonModelProviderFactory` is predefined, do you predefine the required and optional options for it? Will it be inflexible if predefined?
> > > > > >
> > > > > > Thanks,
> > > > > > Hao
> > > > > >
> > > > > > On Mon, Oct 13, 2025 at 10:04 AM Swapna Marru <[email protected]> wrote:
> > > > > > >
> > > > > > > Hi ShengKai,
> > > > > > >
> > > > > > > Documented the initial proposal here,
> > > > > > > https://docs.google.com/document/d/1YzBxLUPvluaZIvR0S3ktc5Be1FF4bNeTsXB9ILfgyWY/edit?usp=sharing
> > > > > > >
> > > > > > > Please review and let me know your thoughts.
> > > > > > >
> > > > > > > -Thanks,
> > > > > > > Swapna
> > > > > > >
> > > > > > > On Tue, Sep 23, 2025 at 10:39 PM Shengkai Fang <[email protected]> wrote:
> > > > > > >
> > > > > > > > I see your point, and I agree that your proposal is feasible. However, there is one limitation to consider: the current loading mechanism first discovers all available factories on the classpath and then filters them based on the user-specified identifiers.
> > > > > > > >
> > > > > > > > In most practical scenarios, we would likely have only one generic factory (e.g., a GenericPythonModelFactory) present in the classpath. This means the framework would be able to load either PyTorch or TensorFlow models, whichever is defined within that single generic implementation, but not both simultaneously unless additional mechanisms are introduced.
> > > > > > > >
> > > > > > > > This doesn't block the proposal, but it's something worth noting as we design the extensibility model. We may want to explore ways to support multiple user-defined providers more seamlessly in the future.
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Shengkai
