Purshotam Shah created FLINK-39961:
--------------------------------------

             Summary: Support routing across models in Flink SQL (ML_PREDICT)
                 Key: FLINK-39961
                 URL: https://issues.apache.org/jira/browse/FLINK-39961
             Project: Flink
          Issue Type: Improvement
            Reporter: Purshotam Shah


*The problem*

FLIP-526 added model inference to Flink SQL (CREATE MODEL, ML_PREDICT, ML_EVALUATE), but a query
is bound to a single, statically chosen model. There is no way to route a request among several
candidate models from SQL, and no learned/ML-based way to make that choice. Today selection has
to be hard-coded to one model or handled outside the query.

*Why it's worth doing*

- Cost and quality: automatically send simple requests to a small/cheap model and hard ones to a
  stronger model, instead of paying for the strongest model on every row or fixing one model for
  all traffic.
- SQL-native: keep the routing decision inside the query, so SQL users get it without external
  orchestration or dropping to a programmatic API.
- Scales better than static rules: a learned router adapts to the workload rather than relying on
  hand-written conditions that go stale.
- Builds directly on the existing model functions (FLIP-526 / ML_PREDICT) rather than introducing
  a parallel mechanism.

*What we plan to do*

- Let a set of candidate models plus a routing strategy be declared in SQL, and have ML_PREDICT
  pick the model per request rather than being pinned to one.
- Support multiple strategies: condition/rule-based selection and a learned (ML) router that
  scores the request and chooses a model.
- Reuse the existing ML_PREDICT execution path to invoke the chosen model, so routing is a
  selection layer on top of the current model functions, not a new inference mechanism.
- Degrade gracefully: fall back to a configured default model when the router can't decide or a
  chosen model fails.
- Make the decision observable (which model served each request) via metrics/logging.
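
To make the intended behavior concrete, here is a minimal Python sketch of routing as a selection layer with fallback and a per-model counter. It is an illustration only, not a proposed Flink API: Router, rule_strategy, learned_strategy, the model names "small"/"large", and the length-based rule are all hypothetical. Plain callables stand in for ML_PREDICT invocations and for a trained router.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional

# Every name below is hypothetical; none of this is an existing Flink API.
Model = Callable[[str], str]               # stands in for one ML_PREDICT call
Strategy = Callable[[str], Optional[str]]  # returns a model name, or None if undecided


def rule_strategy(request: str) -> Optional[str]:
    """Condition-based selection: short requests go to the cheap model."""
    if len(request) <= 20:
        return "small"
    return None  # no rule matched, so the router cannot decide


def learned_strategy(score: Callable[[str], float], threshold: float = 0.5) -> Strategy:
    """Wrap a difficulty scorer (a stand-in for a trained router model)."""
    def choose(request: str) -> Optional[str]:
        return "large" if score(request) >= threshold else "small"
    return choose


@dataclass
class Router:
    models: Dict[str, Model]
    strategy: Strategy
    default: str  # name of the configured fallback model
    served_by: Counter = field(default_factory=Counter)  # which model served each request

    def predict(self, request: str) -> str:
        chosen = self.strategy(request)
        if chosen not in self.models:  # undecided (None) or unknown candidate
            chosen = self.default
        try:
            result = self.models[chosen](request)
        except Exception:
            # The chosen model failed: retry once on the default model.
            # If the default also fails, its exception propagates.
            chosen = self.default
            result = self.models[chosen](request)
        self.served_by[chosen] += 1
        return result


models = {"small": lambda r: "small:" + r, "large": lambda r: "large:" + r}
router = Router(models, rule_strategy, default="large")
print(router.predict("hi"))
print(router.predict("summarize this long incident report in detail"))
print(dict(router.served_by))
```

The point of the sketch is that the strategy only picks a name. Invocation, fallback, and counting stay in one place regardless of whether the choice came from a rule or a learned scorer.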



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
