mengw15 opened a new pull request, #5081: URL: https://github.com/apache/texera/pull/5081
## Summary

- Schema-aware AI assistant embedded in the Python UDF Monaco editor: four interaction modes (ghost text, Cmd+K rewrite, Quick Fix on errors, side chat) plus a dataflow context panel.
- AI sees the upstream schema, a real sample row, and the downstream consumer's expected schema, so suggestions reference actual column names and types instead of generic placeholders.
- Auto-detects schema drift between UDF code and the operator's Extra Output Columns; one-click sync via regex fast-path or AI deep-analysis (handles add / remove / type-update / `retainInputColumns`).

## What's in this PR

- Monaco integrations: `registerInlineCompletionsProvider` (ghost text + column dropdown), `addAction` (Cmd+K rewrite, Fix-with-AI), `registerCodeActionProvider` (Pyright lightbulb), side panel for chat.
- New agent-service router under `/api/udf-copilot/`: `/complete`, `/chat`, `/rewrite`, `/fix`, `/sync-schema`, `/sample-capture`, `/sample-row`. Diagnose-then-fix prompt with 3-way classification (UDF code error vs API-contract violation vs framework error). Output validation + one-shot retry for known anti-patterns (`yield tuple_["x"]` scalar yield, `.items()` on Tuple).
- Python worker hook (`amber/.../data_processor.py`): captures the first input tuple per UDF and asynchronously POSTs to agent-service so the AI gets real data even for workflows where no operator is paginated.
- "Fix with AI" buttons on console + error panels that auto-open the editor and pre-fill the traceback. Cross-component flow via `UdfCopilotService.requestFixAndOpen`.
- Reindent-after-Accept so AI output lands at the right indent level relative to the surrounding selection.

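For reviewers, here is a minimal sketch of what the reindent-after-Accept step could look like. It is an illustration in Python only; the PR's actual implementation is not shown in this description, and the helper names `leading_indent` and `reindent` are hypothetical. The idea is to strip the AI output's own common indentation and then re-base every non-blank line onto the indent of the first selected line.

```python
import textwrap


def leading_indent(line: str) -> str:
    """Return the leading whitespace of a line (hypothetical helper)."""
    return line[: len(line) - len(line.lstrip())]


def reindent(snippet: str, target_indent: str) -> str:
    """Re-base `snippet` so its least-indented line starts at `target_indent`.

    Relative indentation inside the snippet is preserved; blank lines stay empty.
    Tabs are expanded to 4 spaces first, which is an assumption of this sketch.
    """
    dedented = textwrap.dedent(snippet.expandtabs(4))
    return "\n".join(
        target_indent + line if line.strip() else ""
        for line in dedented.splitlines()
    )


# Example: the selection sits 8 spaces deep; the AI output arrives unindented.
selected_line = '        value = tuple_["a"]'
ai_output = 'if tuple_["a"] is None:\n    return\nvalue = tuple_["a"]'
result = reindent(ai_output, leading_indent(selected_line))
print(result)
```

Dedenting before re-indenting means the result does not depend on whatever indent level the model happened to emit.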
## Test plan

- [ ] Open a Python UDF and type `tuple_["`: column-name dropdown appears with all upstream columns
- [ ] Type after `tuple_["a"] > `: ghost text suggests a value-aware threshold based on the sample row
- [ ] Select a line, Cmd+K, "add None handling": preview shows diff, Accept lands code at correct indent
- [ ] Add `tuple_["foo"] = 1`: yellow banner shows `+ foo:integer`; Sync writes to Extra Output Columns
- [ ] Remove that line: banner shows `− foo` (strikethrough); Sync removes from property panel
- [ ] Click the 🔍 audit icon: AI-driven schema analysis runs and proposes `outputColumns` + `retainInputColumns`
- [ ] Run a workflow with `tuple_.items()` bug; in Console tab click red "Fix with AI" button next to the error title; editor auto-opens, Fix overlay pre-filled with traceback, AI rewrites to `as_key_value_pairs()`
- [ ] Side chat: ask "what columns do I have?" and confirm the AI quotes the real schema and sample values

🤖 Generated with [Claude Code](https://claude.com/claude-code)
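To make the schema-drift items in the test plan concrete, the following is a rough sketch of how a regex fast-path could produce the `+ foo:integer` and `− foo` banner entries. It is not the PR's code: the names `ASSIGN_RE`, `infer_type`, `assigned_columns`, and `schema_drift` are hypothetical, the regex only covers simple literal assignments, and every type name other than `integer` is an assumption.

```python
import re

# Matches simple assignments such as: tuple_["foo"] = 1  (but not comparisons with ==)
ASSIGN_RE = re.compile(r"""tuple_\[\s*(['"])(\w+)\1\s*\]\s*=(?!=)\s*(.+)""")


def infer_type(expr: str) -> str:
    """Guess a column type from a literal; anything else is 'unknown'."""
    expr = expr.strip()
    if expr in ("True", "False"):
        return "boolean"  # assumed type name
    if re.fullmatch(r"[+-]?\d+", expr):
        return "integer"
    if re.fullmatch(r"[+-]?\d+\.\d*", expr):
        return "double"  # assumed type name
    if re.fullmatch(r"""(['"]).*\1""", expr):
        return "string"  # assumed type name
    return "unknown"


def assigned_columns(code: str) -> dict[str, str]:
    """Collect columns written via tuple_["name"] = <expr>, skipping comment lines."""
    cols: dict[str, str] = {}
    for line in code.splitlines():
        if line.lstrip().startswith("#"):
            continue
        match = ASSIGN_RE.search(line)
        if match:
            cols[match.group(2)] = infer_type(match.group(3))
    return cols


def schema_drift(code: str, declared: dict[str, str]):
    """Diff columns assigned in the code against the declared extra output columns."""
    found = assigned_columns(code)
    added = {c: t for c, t in found.items() if c not in declared}
    removed = [c for c in declared if c not in found]
    retyped = {
        c: t
        for c, t in found.items()
        if c in declared and t != "unknown" and declared[c] != t
    }
    return added, removed, retyped


added, removed, retyped = schema_drift('tuple_["foo"] = 1\nyield tuple_', {"bar": "string"})
print(added, removed, retyped)
```

Assignments whose right-hand side is not a plain literal come back as `unknown` here; in the PR those cases are presumably what the AI deep-analysis path is for.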
