purushah opened a new pull request, #852:
URL: https://github.com/apache/flink-agents/pull/852

   ## What is the purpose of the change
   
   Adds a **drop-in chat model that routes each request to the underlying 
model selected by a pluggable strategy**, then delegates the call to it. The 
router is a `CHAT_MODEL` resource, so an agent points at it by name with **no 
change to the runtime, events, or agent definition**.
   
   This is the *in-chat* selector (which LLM serves a single `chat()` call). A 
DataStream-level content-based agent router (branching records across agent 
operators) is a separate, follow-up concern.
   
   ## Brief change log
   
   - **`RoutingStrategy`** — pluggable selection SPI (`request -> candidate 
name`). Selection is a *pure* concern; returning `null` means "abstain / no 
opinion".
   - **`ChatModelRouter`** — orchestrates select → (optional cache) → validate 
→ delegate. A strategy that abstains (`null`) or names a non-candidate is a 
**routing miss** and degrades to the configured `default` candidate (validated 
at construction; defaults to the first candidate) rather than failing the 
request.
   - **`FallbackPolicy`** — optional: try remaining candidates on error.
   - **`CachingStrategy`** — optional bounded-LRU memoization of the decision 
per conversation, so an expensive strategy (e.g. an LLM judge) runs **once per 
conversation**, not once per tool-call round. Abstentions (`null`) are never 
cached.
   - Built-in strategies:
     - **`RuleBasedRoutingStrategy`** — deterministic keyword/regex rules + 
default.
     - **`LlmRoutingStrategy`** — a small "judge" model picks the candidate 
from each candidate's name/description (RouteLLM-style). Distinguishes a 
**transient judge failure** (abstain → retried next round, uncached) from an 
**unparseable reply** (deterministic default). Parses by whole-token match (no 
substring mis-routing, e.g. `gpt-4o-mini` won't match a `gpt-4` candidate).
   - **Bring-your-own** strategies are first-class: implement `RoutingStrategy` 
and reference it by fully-qualified class name; loaded via the thread context 
classloader (cluster-safe). ML/learned routing is supported the same way.
   - Adds `LlmRoutingAgentExample` and unit tests.
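
To make the select → (optional cache) → validate → delegate flow concrete, here is a minimal, self-contained sketch. Every name in it (`RoutingSketch`, `Strategy`, `Router`, `route`) and the `String`-based request are hypothetical stand-ins, not this PR's classes or signatures. It also assumes that only valid decisions are cached, keyed by a conversation id, and it returns the chosen candidate name instead of delegating to a real model.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: hypothetical names, not the PR's actual types or signatures. */
final class RoutingSketch {

    /** Stand-in for the selection SPI: request -> candidate name, or null to abstain. */
    interface Strategy {
        String select(String request);
    }

    /** select -> (optional cache) -> validate -> fall back to the default candidate. */
    static final class Router {
        private final List<String> candidates;
        private final String defaultCandidate;
        private final Strategy strategy;
        private final Map<String, String> decisions;

        Router(List<String> candidates, String defaultCandidate, Strategy strategy, int maxCached) {
            if (candidates.isEmpty()) {
                throw new IllegalArgumentException("at least one candidate is required");
            }
            this.candidates = List.copyOf(candidates);
            // The default falls back to the first candidate and is validated up front.
            this.defaultCandidate = defaultCandidate != null ? defaultCandidate : candidates.get(0);
            if (!this.candidates.contains(this.defaultCandidate)) {
                throw new IllegalArgumentException("default is not a candidate: " + this.defaultCandidate);
            }
            this.strategy = strategy;
            // Access-ordered LinkedHashMap that evicts its eldest entry: a bounded LRU.
            this.decisions = new LinkedHashMap<String, String>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                    return size() > maxCached;
                }
            };
        }

        /** Returns the candidate name to delegate to; a routing miss never throws. */
        synchronized String route(String conversationId, String request) {
            String cached = decisions.get(conversationId);
            if (cached != null) {
                return cached; // sticky: the strategy is not consulted again
            }
            String choice = strategy.select(request);
            if (choice == null || !candidates.contains(choice)) {
                return defaultCandidate; // routing miss: degrade to default, do not cache
            }
            decisions.put(conversationId, choice);
            return choice;
        }
    }

    public static void main(String[] args) {
        // Toy rule: escalate requests containing "prove", abstain otherwise.
        Strategy rule = request -> request.contains("prove") ? "large" : null;
        Router router = new Router(List.of("small", "large"), null, rule, 100);
        System.out.println(router.route("c1", "hello"));            // small (abstain -> default)
        System.out.println(router.route("c2", "prove this lemma")); // large
        System.out.println(router.route("c2", "thanks"));           // large (cached decision)
    }
}
```

An access-ordered `LinkedHashMap` is just one simple way to get a bounded LRU; the PR's `CachingStrategy` may well be implemented differently.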
   
   ## Verifying this change
   
   This change adds tests and can be verified as follows:
   
   - Unit tests under `api/.../chat/model/routing/` covering rule selection, 
judge parsing (whole-token match), stickiness across tool-call rounds, 
fallback, caching (incl. abstain-not-cached), routing-miss degrade-to-default, 
and bring-your-own loading. All pass; `spotless:check` clean (JDK 17).
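
As an illustration of what the whole-token judge-parsing tests guard against, the following sketch tokenizes a judge reply and accepts only exact candidate names. `JudgeReplyParser`, its token pattern, and its case-sensitive comparison are assumptions made for this example, not the PR's implementation; here `null` simply means "no candidate found", and the caller would map that to the default.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: finds a candidate name that appears as a whole token in a reply. */
final class JudgeReplyParser {

    // A token is a run of characters that commonly occur in model names.
    private static final Pattern TOKEN = Pattern.compile("[A-Za-z0-9._:/-]+");

    // Punctuation to trim from token edges, e.g. the full stop in "gpt-4.".
    private static final Pattern EDGE_PUNCT = Pattern.compile("^[._:/-]+|[._:/-]+$");

    /** Returns the first candidate matched as a whole token, or null if none is present. */
    static String parse(String reply, List<String> candidates) {
        Matcher matcher = TOKEN.matcher(reply);
        while (matcher.find()) {
            String token = EDGE_PUNCT.matcher(matcher.group()).replaceAll("");
            // Exact equality, so "gpt-4o-mini" can never satisfy a "gpt-4" candidate.
            if (candidates.contains(token)) {
                return token;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<String> candidates = List.of("gpt-4", "gpt-4o-mini");
        System.out.println(parse("Use gpt-4o-mini for this.", candidates)); // gpt-4o-mini
        System.out.println(parse("gpt-4o-mini", List.of("gpt-4")));         // null
        System.out.println(parse("I'd pick gpt-4.", candidates));           // gpt-4
    }
}
```

A naive `reply.contains(candidate)` check would return `gpt-4` for the second call, which is the substring mis-routing the tests are meant to rule out.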
   
   ## Does this pull request potentially affect one of the following parts:
   
   - Dependencies (does it add or upgrade a dependency): **no**
   - The public API: **yes** — adds the 
`org.apache.flink.agents.api.chat.model.routing` package (additive; no existing 
API changed).
   - The serializers: **no**
   - The runtime per-record code paths: **no** (router is a `CHAT_MODEL` 
resource resolved by name)
   - Anything that affects deployment or recovery: **no** — preserves 
exactly-once / keyed-state / checkpoint semantics (no new operator, no nested 
invocation).
   
   ## Security note
   
   An LLM/ML routing decision is a **hint, not an authority** — the user's 
message is sent to the judge model, so a routing decision is susceptible to 
prompt injection. Cost/privilege/safety controls must not be gated solely on 
it. This is documented on `LlmRoutingStrategy`.
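
One way to apply this in a deployment is to clamp whatever the router picked to an allow-list that is computed independently of the model's output. The sketch below is purely illustrative: `RoutingGuard`, `enforce`, and the idea of a per-caller allow-list are hypothetical and are not part of this PR.

```java
import java.util.Set;

/** Hypothetical guard: the routing decision is a hint, the allow-list is the authority. */
final class RoutingGuard {

    /**
     * Returns the routed candidate only if the caller is entitled to it; otherwise
     * returns the safe fallback, which must itself be on the allow-list.
     */
    static String enforce(String routed, Set<String> allowedForCaller, String safeFallback) {
        if (!allowedForCaller.contains(safeFallback)) {
            throw new IllegalArgumentException("fallback is not allowed: " + safeFallback);
        }
        // Null check first: immutable sets reject contains(null).
        return routed != null && allowedForCaller.contains(routed) ? routed : safeFallback;
    }

    public static void main(String[] args) {
        // An injected prompt talked the judge into "large", but this caller may only use "small".
        System.out.println(enforce("large", Set.of("small"), "small"));          // small
        System.out.println(enforce("large", Set.of("small", "large"), "small")); // large
    }
}
```

The point of the design is that the allow-list comes from configuration or identity, so a manipulated judge reply cannot widen it.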
   
   ## Documentation
   
   - New public package is documented via javadoc on each type. Built-in 
strategies, the abstain/routing-miss contract, and the bring-your-own extension 
point are described on the SPI.

