Hi Swapna,

Some other suggestions and questions:

1. Maybe rename `getPythonPredictFunction` to something like
`createPythonPredictFunction` so that it aligns with `createPredictFunction`
in `PredictRuntimeProvider` (rough sketch below).
2. Is it possible to enforce that the Java `PythonFunction` returned from
`createPythonFunction` implements the Python `PredictFunction`?
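
For illustration, here is a minimal sketch of what I had in mind for 1; the
interface and method names below are only my assumptions based on this
thread, not the actual FLIP-552 definitions:

```java
// Hypothetical sketch only -- interface and method names are assumptions
// drawn from this discussion, not the actual FLIP-552 definitions.
public interface PythonPredictRuntimeProvider {

    // Renamed from getPythonPredictFunction so the naming mirrors
    // PredictRuntimeProvider#createPredictFunction.
    PythonPredictFunction createPythonPredictFunction();
}
```

For 2, since the Java type system cannot see the Python class hierarchy,
the enforcement would probably need to be a runtime check on the Python
side, e.g. an isinstance check against the Python PredictFunction base
class when the submitted function is loaded in the Python worker.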

Hao

On Tue, Oct 14, 2025 at 2:52 PM Swapna Marru <[email protected]>
wrote:

> Thanks Matyas and Shengkai.
>
>
> > 1. I'm wondering whether we could extend the SQL API to change how Python
> > models are loaded. For example, we could allow users to write:
> >
> > ```
> > CREATE MODEL my_pytorch_model
> > WITH (
> >    'type' = 'pytorch'
> > ) LANGUAGE PYTHON;
> > ```
>
> I thought about this, but one initial concern I had with this approach is:
> will a model provider be implemented completely in Python itself?
> When we refer to PyTorch, HuggingFace, or ONNX model providers, for
> example, do we need different behavior or optimizations related to
> PredictRuntimeContext/model config building, batching, or resource
> scheduling decisions, which would need to be done in the Java Flink entry
> points?
>
>
> > 2. Beam already supports TensorFlow, ONNX, and many built-in models. Can
> > we reuse Beam's utilities to build Flink prediction functions[1]?
>
> Thanks, I will take a look at this to understand how it works and if we can
> learn from that design.
>
> > 3. It would be better if we introduced a PredictRuntimeContext to help
> > users download required weight files.
>
> Currently I have this as model config (set_model_config), which is passed
> to the PredictFunction in Python. But a PredictRuntimeContext seems more
> suitable.
> I will look into passing a PredictRuntimeContext in open, similar to the
> RuntimeContext for UDFs.
>
>
> > 4. In ML, users typically perform inference on batches of data.
> > Therefore, per-record evaluation may not be necessary. How about we just
> > introduce API like[2]?
>
> Yes, I completely agree with this. I was first aiming for agreement on the
> model creation and provider API, and will then look at this in more detail.
>
> On Tue, Oct 14, 2025 at 10:39 AM Őrhidi Mátyás <[email protected]>
> wrote:
>
> > Hey Shengkai,
> >
> > Thank you for your observations. This proposal is mostly driven by
> > Swapna, but I could also share my thoughts here, please find them
> > inline.
> >
> > Cheers,
> > Matyas
> >
> > On Tue, Oct 14, 2025 at 3:02 AM Shengkai Fang <[email protected]> wrote:
> > >
> > > Hi, Matyas.
> > >
> > > Thanks for the proposal.  I have some suggestions about the proposal.
> > >
> > > 1. I'm wondering whether we could extend the SQL API to change how
> > > Python models are loaded. For example, we could allow users to write:
> > >
> > > ```
> > > CREATE MODEL my_pytorch_model
> > > WITH (
> > >    'type' = 'pytorch'
> > > ) LANGUAGE PYTHON;
> > > ```
> > > In this case, we wouldn't rely on Java SPI to load the Python model
> > > provider. However, I'm not sure whether Python has a similar mechanism
> > > to SPI that avoids hardcoding class paths.
> >
> > This is an interesting idea; however, we are proposing the provider
> > model because it aligns with Flink's existing Java-based architecture
> > for discovering plugins. A Java entry point is required to launch the
> > Python code, and this is the standard way to do it.
> >
> > > 2. Beam already supports TensorFlow, ONNX, and many built-in models.
> > > Can we reuse Beam's utilities to build Flink prediction functions[1]?
> >
> > We can certainly learn from Beam's design, but directly reusing it
> > would add a very heavy dependency and be difficult to integrate
> > cleanly into Flink's native processing model.
> >
> > > 3. It would be better if we introduced a PredictRuntimeContext to help
> > > users download required weight files.
> >
> > This is actually a great idea and essential for usability. Just to
> > double-check your suggestion: is your proposal to have an explicit
> > PredictRuntimeContext for dynamic model file downloading?
> >
> > >
> > > 4. In ML, users typically perform inference on batches of data.
> > > Therefore, per-record evaluation may not be necessary. How about we
> > > just introduce API like[2]?
> >
> > I agree completely. The row-by-row API is just a starting point, and
> > we should aim to prioritize support for efficient batch inference to
> > ensure good performance for real-world models.
> > >
> > > Best,
> > > Shengkai
> > >
> > > [1] https://beam.apache.org/documentation/ml/about-ml/
> > > [2]
> > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-491%3A+BundledAggregateFunction+for+batched+aggregation
> > >
> > >
> > >
> > >
> > > Swapna Marru <[email protected]> wrote on Tue, Oct 14, 2025 at 11:53:
> > >
> > > > Thanks Matyas.
> > > >
> > > > Hao,
> > > >
> > > > The proposal is to provide a generic framework.
> > > > The interfaces PythonPredictRuntimeProvider / PythonPredictFunction /
> > > > PredictFunction (in Python) are defined to provide a base for that
> > > > framework.
> > > >
> > > > generic-python is one of the implementations, registered similarly to
> > > > openai in the original FLIP.
> > > > It is, however, not a concrete end-to-end implementation. It can be
> > > > used:
> > > > 1. As a reference implementation for other complete, end-to-end,
> > > > concrete model provider implementations.
> > > > 2. Out of the box, for simple Python model implementations, to avoid
> > > > a boilerplate Java provider implementation.
> > > >
> > > > I will also open a PR with the current implementation changes, so
> > > > it's clearer for further discussion.
> > > >
> > > > -Thanks,
> > > > M.Swapna
> > > >
> > > > On Mon, Oct 13, 2025 at 5:04 PM Őrhidi Mátyás <[email protected]>
> > > > wrote:
> > > >
> > > > >
> > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-552+Support+ML_PREDICT+for+Python+based+model+providers
> > > > >
> > > > > On Mon, Oct 13, 2025 at 4:10 PM Őrhidi Mátyás <[email protected]>
> > > > > wrote:
> > > > > >
> > > > > > Swapna, I can help you to create a FLIP page.
> > > > > >
> > > > > > > On Mon, Oct 13, 2025 at 3:58 PM Hao Li <[email protected]>
> > > > > > > wrote:
> > > > > > >
> > > > > > > Hi Swapna,
> > > > > > >
> > > > > > > Thanks for the proposal. Can you put it in a FLIP and start a
> > > > > discussion
> > > > > > > thread for it?
> > > > > > >
> > > > > > > From an initial look, I'm a bit confused whether this is a
> > > > > > > concrete implementation for "generic-python" or a generic
> > > > > > > framework to handle Python predict functions, because everything
> > > > > > > seems concrete, like `GenericPythonModelProviderFactory` and
> > > > > > > `GenericPythonModelProvider`, except the final Python predict
> > > > > > > function.
> > > > > > >
> > > > > > > Also, if `GenericPythonModelProviderFactory` is predefined, do
> > > > > > > you predefine the required and optional options for it? Will it
> > > > > > > be inflexible if predefined?
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Hao
> > > > > > >
> > > > > > > On Mon, Oct 13, 2025 at 10:04 AM Swapna Marru <[email protected]>
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > Hi ShengKai,
> > > > > > > >
> > > > > > > > Documented the initial proposal here:
> > > > > > > >
> > > > > > > > https://docs.google.com/document/d/1YzBxLUPvluaZIvR0S3ktc5Be1FF4bNeTsXB9ILfgyWY/edit?usp=sharing
> > > > > > > >
> > > > > > > > Please review and let me know your thoughts.
> > > > > > > >
> > > > > > > > -Thanks,
> > > > > > > > Swapna
> > > > > > > >
> > > > > > > > On Tue, Sep 23, 2025 at 10:39 PM Shengkai Fang <[email protected]>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > I see your point, and I agree that your proposal is
> > > > > > > > > feasible. However, there is one limitation to consider: the
> > > > > > > > > current loading mechanism first discovers all available
> > > > > > > > > factories on the classpath and then filters them based on
> > > > > > > > > the user-specified identifiers.
> > > > > > > > >
> > > > > > > > > In most practical scenarios, we would likely have only one
> > > > > > > > > generic factory (e.g., a GenericPythonModelFactory) present
> > > > > > > > > in the classpath. This means the framework would be able to
> > > > > > > > > load either PyTorch or TensorFlow models—whichever is
> > > > > > > > > defined within that single generic implementation—but not
> > > > > > > > > both simultaneously unless additional mechanisms are
> > > > > > > > > introduced.
> > > > > > > > >
> > > > > > > > > This doesn't block the proposal, but it's something worth
> > > > > > > > > noting as we design the extensibility model. We may want to
> > > > > > > > > explore ways to support multiple user-defined providers
> > > > > > > > > more seamlessly in the future.
> > > > > > > > >
> > > > > > > > > Best,
> > > > > > > > > Shengkai
> > > > > > > > >
> > > > >
> > > >
> >
>
