Hi Matyas and Swapna,

Please see my responses inline.

>This is an interesting idea, however we are proposing using the
> provider model because it aligns with Flink's existing Java-based
> architecture for discovering plugins. A Java entry point is required
> to launch the Python code, and this is the standard way to do it.

A Java entry point is required, agreed. However, I don't mean that every Python
model implementation needs its own Java entry point. Instead, we could provide a
single generic Java entry point that starts the Python process and then discovers
the available Python model implementations dynamically (a rough sketch of the
Python-side discovery is below). That way, model developers would only need to
focus on their Python code, without having to write a Java factory wrapper.
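
For illustration only, here is a minimal sketch of how that discovery could look
on the Python side, assuming we reuse standard packaging entry points (the
entry-point group name and the helper below are hypothetical, not an existing
Flink API; the `entry_points(group=...)` signature needs Python 3.10+):

```python
from importlib.metadata import entry_points


def discover_predict_functions(group: str = "flink.ml.predict_functions"):
    """Map a model identifier to its PredictFunction class.

    The generic Java entry point would start the Python process once and call
    a helper like this, so a model developer only ships a Python package that
    declares an entry point, e.g.
    'pytorch = my_models.torch_model:TorchPredictFunction',
    and never writes a Java factory wrapper.
    """
    return {ep.name: ep.load() for ep in entry_points(group=group)}
```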

> We can certainly learn from Beam's design, but directly reusing it
> would add a very heavy dependency and be difficult to integrate
> cleanly into Flink's native processing model.

Flink already uses Apache Beam in PyFlink. For example, pyflink-udf-runner.sh is
the startup script that PyFlink uses to launch the Python process at runtime.

>This is actually a great idea and essential for usability. Just to
> double check on your suggestion, your proposal is to have an explicit
> PredictRuntimeContext for dynamic model file downloading?

Yes! I think passing a PredictRuntimeContext into open() is a good idea.
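
To make the suggestion concrete, here is a rough sketch of what a Python
PredictFunction using such a context could look like. Every name here is an
assumption for illustration (the PredictRuntimeContext methods are not yet
defined in the FLIP), and PyTorch is used only as an example:

```python
import torch  # a PyTorch model is used purely as an example


class TorchPredictFunction:  # would implement the proposed Python PredictFunction
    def open(self, context) -> None:
        # `context` stands in for the proposed PredictRuntimeContext. The helper
        # name below is hypothetical: it would download a remote weight file
        # into a task-local cache and return the local path.
        local_weights = context.get_model_artifact("s3://my-bucket/model.pt")
        self._model = torch.load(local_weights, map_location="cpu")
        self._model.eval()

    def predict(self, row):
        with torch.no_grad():
            return self._model(row)
```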

Best,
Shengkai


Hao Li <[email protected]> 于2025年10月15日周三 07:08写道:

> Hi Swapna,
>
> Some other suggestion and questions:
>
> 1. Maybe change `getPythonPredictFunction` to `createPythonPredictFunction` to
> align with `createPredictFunction` in `PredictRuntimeProvider`.
> 2. Is it possible to enforce that the returned Java `PythonFunction` from
> `createPythonFunction` implements the Python `PredictFunction`?
>
> Hao
>
> On Tue, Oct 14, 2025 at 2:52 PM Swapna Marru <[email protected]>
> wrote:
>
> > Thanks Matyas and Shengkai.
> >
> >
> > > 1. I'm wondering whether we could extend the SQL API to change how Python
> > > models are loaded. For example, we could allow users to write:
> > >
> > > ```
> > > CREATE MODEL my_pytorch_model
> > > WITH (
> > >    'type' = 'pytorch'
> > > ) LANGUAGE PYTHON;
> > > ```
> >
> > I thought about this a bit, but one initial concern I had with this model
> > is: will a model provider be implemented completely in Python itself?
> > When we refer to PyTorch, HuggingFace, or ONNX model providers, for
> > example, do we need different behavior or optimizations related to
> > PredictRuntimeContext/model config building, batching, or resource
> > scheduling decisions that need to be done in the Java Flink entry points?
> >
> >
> > > 2. Beam already supports TensorFlow, ONNX, and many built-in models. Can
> > > we reuse Beam's utilities to build Flink prediction functions[1]?
> >
> > Thanks, I will take a look at this to understand how it works and whether
> > we can learn from that design.
> >
> > > 3. It would be better if we introduced a PredictRuntimeContext to help
> > > users download required weight files.
> >
> > Currently I have this as Model Config (set_model_config), which will be
> > passed to the PredictFunction in Python. But a PredictRuntimeContext seems
> > more suitable.
> > I will look into passing a PredictRuntimeContext in open, similar to the
> > RuntimeContext for UDFs.
> >
> >
> > > 4. In ML, users typically perform inference on batches of data. Therefore,
> > > per-record evaluation may not be necessary. How about we just introduce an
> > > API like[2]?
> >
> > Yes, I completely agree on this. I was first aiming at agreement on the
> > model creation and provider API, and then looking at this in more detail.
> >
> > On Tue, Oct 14, 2025 at 10:39 AM Őrhidi Mátyás <[email protected]>
> > wrote:
> >
> > > Hey Shengkai,
> > >
> > > Thank you for your observations. This proposal is mostly driven by
> > > Swapna, but I could also share my thoughts here, please find them
> > > inline.
> > >
> > > Cheers,
> > > Matyas
> > >
> > > On Tue, Oct 14, 2025 at 3:02 AM Shengkai Fang <[email protected]>
> > > wrote:
> > > >
> > > > Hi, Matyas.
> > > >
> > > > Thanks for the proposal. I have some suggestions about it.
> > > >
> > > > 1. I'm wondering whether we could extend the SQL API to change how Python
> > > > models are loaded. For example, we could allow users to write:
> > > >
> > > > ```
> > > > CREATE MODEL my_pytorch_model
> > > > WITH (
> > > >    'type' = 'pytorch'
> > > > ) LANGUAGE PYTHON;
> > > > ```
> > > > In this case, we wouldn't rely on Java SPI to load the Python model
> > > > provider. However, I'm not sure whether Python has a similar mechanism
> > > > to SPI that avoids hardcoding class paths.
> > >
> > > This is an interesting idea, however we are proposing using the
> > > provider model because it aligns with Flink's existing Java-based
> > > architecture for discovering plugins. A Java entry point is required
> > > to launch the Python code, and this is the standard way to do it.
> > >
> > > > 2. Beam already supports TensorFlow, ONNX, and many built-in models. Can
> > > > we reuse Beam's utilities to build Flink prediction functions[1]?
> > >
> > > We can certainly learn from Beam's design, but directly reusing it
> > > would add a very heavy dependency and be difficult to integrate
> > > cleanly into Flink's native processing model.
> > >
> > > > 3. It would be better if we introduced a PredictRuntimeContext to help
> > > > users download required weight files.
> > >
> > > This is actually a great idea and essential for usability. Just to
> > > double check on your suggestion, your proposal is to have an explicit
> > > PredictRuntimeContext for dynamic model file downloading?
> > >
> > > >
> > > > 4. In ML, users typically perform inference on batches of data. Therefore,
> > > > per-record evaluation may not be necessary. How about we just introduce an
> > > > API like[2]?
> > >
> > > I agree completely. The row-by-row API is just a starting point, and
> > > we should aim to prioritize support for efficient batch inference to
> > > ensure good performance for real-world models.
> > > >
> > > > Best,
> > > > Shengkai
> > > >
> > > > [1] https://beam.apache.org/documentation/ml/about-ml/
> > > > [2]
> > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-491%3A+BundledAggregateFunction+for+batched+aggregation
> > > >
> > > >
> > > >
> > > >
> > > > Swapna Marru <[email protected]> wrote on Tue, Oct 14, 2025 at 11:53:
> > > >
> > > > > Thanks Matyas.
> > > > >
> > > > > Hao,
> > > > >
> > > > > The proposal is to provide a generic framework.
> > > > > The interfaces PythonPredictRuntimeProvider / PythonPredictFunction /
> > > > > PredictFunction (in Python) are defined to provide a base for that
> > > > > framework.
> > > > >
> > > > > generic-python is one of the implementations, registered similarly to
> > > > > openai in the original FLIP.
> > > > > It is, however, not a concrete end-to-end implementation. It can be used:
> > > > > 1. As a reference implementation for other complete end-to-end concrete
> > > > > model provider implementations
> > > > > 2. For simple Python model implementations, out of the box, to avoid a
> > > > > boilerplate Java provider implementation.
> > > > >
> > > > > I will also open a PR with the current implementation changes, so it's
> > > > > clearer for further discussion.
> > > > >
> > > > > -Thanks,
> > > > > M.Swapna
> > > > >
> > > > > On Mon, Oct 13, 2025 at 5:04 PM Őrhidi Mátyás <[email protected]>
> > > > > wrote:
> > > > >
> > > > > >
> > > > > >
> > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-552+Support+ML_PREDICT+for+Python+based+model+providers
> > > > > >
> > > > > > On Mon, Oct 13, 2025 at 4:10 PM Őrhidi Mátyás <[email protected]>
> > > > > > wrote:
> > > > > > >
> > > > > > > Swapna, I can help you to create a FLIP page.
> > > > > > >
> > > > > > > On Mon, Oct 13, 2025 at 3:58 PM Hao Li <[email protected]>
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > Hi Swapna,
> > > > > > > >
> > > > > > > > Thanks for the proposal. Can you put it in a FLIP and start a
> > > > > > > > discussion thread for it?
> > > > > > > >
> > > > > > > > From an initial look, I'm a bit confused whether this is a concrete
> > > > > > > > implementation for "generic-python" or a generic framework to handle
> > > > > > > > the Python predict function, because everything seems concrete, like
> > > > > > > > `GenericPythonModelProviderFactory` and `GenericPythonModelProvider`,
> > > > > > > > except the final Python predict function.
> > > > > > > >
> > > > > > > > Also, if `GenericPythonModelProviderFactory` is predefined, do you
> > > > > > > > predefine the required and optional options for it? Will it be
> > > > > > > > inflexible if predefined?
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Hao
> > > > > > > >
> > > > > > > > On Mon, Oct 13, 2025 at 10:04 AM Swapna Marru <[email protected]>
> > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > Hi ShengKai,
> > > > > > > > >
> > > > > > > > > Documented the initial proposal here:
> > > > > > > > >
> > > > > > > > > https://docs.google.com/document/d/1YzBxLUPvluaZIvR0S3ktc5Be1FF4bNeTsXB9ILfgyWY/edit?usp=sharing
> > > > > > > > >
> > > > > > > > > Please review and let me know your thoughts.
> > > > > > > > >
> > > > > > > > > -Thanks,
> > > > > > > > > Swapna
> > > > > > > > >
> > > > > > > > > On Tue, Sep 23, 2025 at 10:39 PM Shengkai Fang <[email protected]>
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > I see your point, and I agree that your proposal is feasible.
> > > > > > > > > > However, there is one limitation to consider: the current loading
> > > > > > > > > > mechanism first discovers all available factories on the classpath
> > > > > > > > > > and then filters them based on the user-specified identifiers.
> > > > > > > > > >
> > > > > > > > > > In most practical scenarios, we would likely have only one generic
> > > > > > > > > > factory (e.g., a GenericPythonModelFactory) present in the
> > > > > > > > > > classpath. This means the framework would be able to load either
> > > > > > > > > > PyTorch or TensorFlow models (whichever is defined within that
> > > > > > > > > > single generic implementation), but not both simultaneously unless
> > > > > > > > > > additional mechanisms are introduced.
> > > > > > > > > >
> > > > > > > > > > This doesn't block the proposal, but it's something worth noting
> > > > > > > > > > as we design the extensibility model. We may want to explore ways
> > > > > > > > > > to support multiple user-defined providers more seamlessly in the
> > > > > > > > > > future.
> > > > > > > > > >
> > > > > > > > > > Best,
> > > > > > > > > > Shengkai
> > > > > > > > > >
