GitHub user mengw15 added a comment to the discussion: Task ideas for the 
dkNet-AI · Apache Texera Agent Hackathon

**UDF Copilot: a schema-aware AI inside the Python UDF editor**

Themes: Human-Agent Collaboration · Productivity Enhancement

**The gap.** Today, `agent-service` is great at building/editing whole 
workflows, and the editor has a single "Add Type Annotation" button. But once 
you start *writing* a UDF, you're on your own — no autocomplete that knows the 
upstream schema, no "fix this", no inline edits, no chat.

**The idea.** Bring a Cursor-style assistant into the Monaco UDF editor, with 
one thing GitHub Copilot can't have: **Texera context** — the upstream 
operator's output schema, a real sample row, the downstream consumer, and the 
UDF's API contract (`ProcessTupleOperator`, tuple vs table API).
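
A minimal sketch of how that context might be bundled and turned into a prompt preamble. `UdfContext`, its field names, and `render_context` are hypothetical names made up for illustration, not existing Texera or `agent-service` APIs, and the schema is assumed to be a plain column-name-to-type mapping.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class UdfContext:
    """Hypothetical bundle of Texera context sent with every assistant request."""
    upstream_schema: Dict[str, str]            # column name -> type name
    sample_row: Dict[str, Any]                 # one real row from the upstream operator
    downstream_operator: Optional[str] = None  # name of the consumer, if any
    api_mode: str = "tuple"                    # "tuple" or "table"


def render_context(ctx: UdfContext) -> str:
    """Render the context as plain text to prepend to a model prompt."""
    lines = [f"UDF API: {ctx.api_mode} API", "Upstream schema:"]
    for name, type_name in ctx.upstream_schema.items():
        sample = ctx.sample_row.get(name)
        lines.append(f"  - {name}: {type_name} (sample: {sample!r})")
    if ctx.downstream_operator:
        lines.append(f"Downstream consumer: {ctx.downstream_operator}")
    return "\n".join(lines)


# Illustrative values only; real ones would come from the open workflow.
ctx = UdfContext(
    upstream_schema={"user_id": "integer", "user_age": "integer"},
    sample_row={"user_id": 7, "user_age": 34},
    downstream_operator="Filter",
)
preamble = render_context(ctx)
```

Sharing one context object across all modes would keep ghost text, Cmd+K, Quick Fix, and chat consistent about which columns exist.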

**Four interaction modes** sharing that context:

1. **Schema-aware ghost text** — typing `tuple_["` suggests real column names 
from the upstream operator, not generic guesses.
2. **Cmd+K inline rewrite** — "add an `is_adult` column for age > 18" → AI 
rewrites the lines, shows a diff, Accept/Reject.
3. **Quick Fix on Pyright diagnostics and runtime errors** — `KeyError: 'age'` → 
one-click fix that knows the schema actually has `user_age`.
4. **Side chat panel** — "why is this slow?", "convert tuple API to table API", 
"generate pytest with null edge cases", each with an "Apply to editor" button.
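
Modes 1 and 3 can be approximated without a model at all, which may be useful as a fast first pass. The sketch below is an assumption-laden illustration: `COLUMNS`, `suggest_columns`, and `suggest_fix` are hypothetical names, and the token-matching heuristic is just one simple way to map a missing key to a real column.

```python
import re
from typing import List, Optional

# Hypothetical upstream schema; in the proposal this comes from the upstream operator.
COLUMNS = ["user_id", "user_age", "country"]

# Matches an unfinished subscript such as  tuple_["us  at the end of the typed text.
_OPEN_KEY = re.compile(r"""tuple_\[(["'])([A-Za-z0-9_]*)$""")


def suggest_columns(line_prefix: str, columns: List[str]) -> List[str]:
    """Ghost-text candidates: columns that start with the partially typed key."""
    match = _OPEN_KEY.search(line_prefix)
    if match is None:
        return []
    typed = match.group(2)
    return [c for c in columns if c.startswith(typed)]


def suggest_fix(missing_key: str, columns: List[str]) -> Optional[str]:
    """Quick-fix candidate: a column that has the missing key as a '_'-separated token."""
    candidates = [c for c in columns if missing_key in c.split("_")]
    # Only offer a one-click fix when the replacement is unambiguous.
    return candidates[0] if len(candidates) == 1 else None
```

With the example schema, typing `tuple_["user_` would offer `user_id` and `user_age`, and a `KeyError: 'age'` would map to `user_age`. Anything ambiguous falls through to `None`, where the model-backed path could take over.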

**Why it matters.** A large share of Texera's audience are domain experts 
(biology, social science, medicine) who write UDFs but aren't full-time 
programmers. A schema-aware copilot collapses the learning curve from "read 
pytexera docs first" to "just describe what you want, with real column names 
autocompleting."

**Implementation sketch.** Frontend: extend `code-editor.component.ts`, add a 
Monaco `InlineCompletionsProvider`, register a Cmd+K widget, add a side panel. 
Backend: four new endpoints in `agent-service`, reusing the Vercel AI SDK already 
wired up. Context plumbing: pull the upstream schema from `WorkflowActionService`, 
which is already in memory on the frontend.
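
The Accept/Reject step of the Cmd+K widget boils down to diffing the current code against the proposed rewrite. In the real editor this would live in the frontend, so the Python below is only a sketch of the logic using the standard `difflib` module; `preview_rewrite`, `apply_rewrite`, and the sample UDF lines are hypothetical.

```python
import difflib
from typing import List


def preview_rewrite(original: str, proposed: str) -> List[str]:
    """Return unified-diff lines between the current UDF text and a proposed rewrite."""
    return list(
        difflib.unified_diff(
            original.splitlines(),
            proposed.splitlines(),
            fromfile="udf.py (current)",
            tofile="udf.py (proposed)",
            lineterm="",
        )
    )


def apply_rewrite(original: str, proposed: str, accepted: bool) -> str:
    """Accept swaps in the proposed text; Reject leaves the editor unchanged."""
    return proposed if accepted else original


# Illustrative snippet only, not a complete UDF.
current = 'yield tuple_\n'
proposed = 'tuple_["is_adult"] = tuple_["user_age"] > 18\nyield tuple_\n'
diff = preview_rewrite(current, proposed)
```

Keeping the rewrite as a full proposed text, rather than applying model output directly, means a rejected suggestion never touches the editor buffer.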

**Demo.** Open a Python UDF after a CSVScan → ghost text suggests real columns 
→ Cmd+K rewrites a loop as vectorized pandas → force a schema typo → one-click 
Fix → side chat generates pytest. ~3 minutes end-to-end.

GitHub link: 
https://github.com/apache/texera/discussions/5059#discussioncomment-16924109
