Purshotam Shah created FLINK-39961:
--------------------------------------
Summary: Support routing across models in Flink SQL (ML_PREDICT)
Key: FLINK-39961
URL: https://issues.apache.org/jira/browse/FLINK-39961
Project: Flink
Issue Type: Improvement
Reporter: Purshotam Shah
*The problem*
FLIP-526 added model inference to Flink SQL (CREATE MODEL, ML_PREDICT, ML_EVALUATE), but a query is bound to a single, statically chosen model. There is no way to route a request among several candidate models from SQL, and no learned (ML-based) way to make that choice. Today, selection has to be hard-coded to one model or handled outside the query.
*Why it's worth doing*
- Cost and quality: automatically send simple requests to a small/cheap model and hard ones to a stronger model, instead of paying for the strongest model on every row or fixing one model for all traffic.
- SQL-native: keep the routing decision inside the query, so SQL users get it without external orchestration or dropping to a programmatic API.
- Scales better than static rules: a learned router adapts to the workload rather than relying on hand-written conditions that go stale.
- Builds directly on the existing model functions (FLIP-526 / ML_PREDICT) rather than introducing a parallel mechanism.
*What we plan to do*
- Let a set of candidate models plus a routing strategy be declared in SQL, and have ML_PREDICT pick the model per request rather than being pinned to one.
- Support multiple strategies: condition/rule-based selection and a learned (ML) router that scores the request and chooses a model.
- Reuse the existing ML_PREDICT execution path to invoke the chosen model, so routing is a selection layer on top of the current model functions, not a new inference mechanism.
- Degrade gracefully: fall back to a configured default model when the router can't decide or a chosen model fails.
- Make the decision observable (which model served each request) via metrics/logging.
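
To make the intended semantics concrete, here is a minimal Python sketch of the selection layer described in the plan: rule-based selection first, then an optional scoring router, then fallback to a default model, with a per-model counter for observability. It is only an illustration of the behavior, not the proposed design or any Flink API. Every name in it (Router, choose, predict, served, the small/large models) is hypothetical, and plain Python callables stand in for models that would really be invoked through ML_PREDICT.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical stand-in: a "model" is any callable from request to response.
# In Flink this role would be played by the existing ML_PREDICT execution path.
Model = Callable[[str], str]
Rule = Tuple[Callable[[str], bool], str]  # (condition, model name)


@dataclass
class Router:
    models: Dict[str, Model]        # candidate models by name
    default: str                    # fallback model name
    rules: List[Rule] = field(default_factory=list)
    # Stand-in for a learned router: returns a score per model name.
    scorer: Optional[Callable[[str], Dict[str, float]]] = None
    # Observability: how many requests each model actually served.
    served: Counter = field(default_factory=Counter)

    def choose(self, request: str) -> str:
        """Pick a model name; return the default if no strategy decides."""
        for condition, name in self.rules:      # rule-based: first match wins
            if condition(request):
                return name
        if self.scorer is not None:             # scored: highest score wins
            scores = {n: s for n, s in self.scorer(request).items()
                      if n in self.models}
            if scores:
                return max(scores, key=scores.get)
        return self.default

    def predict(self, request: str) -> Tuple[str, str]:
        """Return (name of the model that served the request, its response)."""
        name = self.choose(request)
        if name not in self.models:             # router named an unknown model
            name = self.default
        try:
            result = self.models[name](request)
        except Exception:
            if name == self.default:            # nothing left to fall back to
                raise
            name = self.default                 # chosen model failed
            result = self.models[name](request)
        self.served[name] += 1
        return name, result


def small(request: str) -> str:
    return f"small:{request}"


def large(request: str) -> str:
    if "boom" in request:                       # simulate a failing model
        raise RuntimeError("model unavailable")
    return f"large:{request}"


# Rule: requests longer than 20 characters go to the stronger model.
router = Router(
    models={"small": small, "large": large},
    default="small",
    rules=[(lambda r: len(r) > 20, "large")],
)
```

With this setup, a short request is served by the default "small" model, a request longer than 20 characters is routed to "large", and a long request that makes "large" raise is retried on "small". The served counter records which model handled each request, which is the kind of signal the last bullet proposes exposing through metrics/logging.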