zyratlo opened a new pull request, #5260:
URL: https://github.com/apache/texera/pull/5260
### What changes were proposed in this PR?
Introduces the frontend LLM session class that converts a Jupyter notebook
into a Texera workflow JSON plus a bidirectional cell-to-operator mapping,
along with the prompt library it uses. The PR adds two files under
`frontend/src/app/workspace/service/notebook-migration/`, totalling ~700 lines
(~410 of which are prompt text).
**`migration-llm.ts`** — defines `NotebookMigrationLLM`, an `@Injectable`
class wrapping a Vercel AI SDK chat session against the LiteLLM proxy already
exposed on `main` at `/api/chat/completion`.
- `initialize(modelType, apiKey)` — builds an OpenAI-compatible chat
client via `createOpenAI({ baseURL: AppSettings.getApiEndpoint() })` and seeds the
message history with Texera documentation as `system` messages.
- `verifyConnection()` — does a 10-token `ping` call to validate that the
API key works against the configured model.
- `convertNotebookToWorkflow(notebook)` — extracts code cells (each tagged
with a UUID in `metadata.uuid`), sends `WORKFLOW_PROMPT` + the notebook to get
a JSON of UDF operators / edges, then sends `MAPPING_PROMPT` to get the
cell↔operator mapping. Assembles a complete Texera workflow JSON (`PythonUDFV2`
operators with stub input/output ports, links derived from the LLM's edge list,
default settings) plus a bidirectional `operator_to_cell` / `cell_to_operator`
mapping. Returns both as a JSON string.
- `close()` — clears the message history and the model reference.
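
As a rough illustration of the cell extraction and bidirectional mapping steps described for `convertNotebookToWorkflow`, here is a minimal sketch. It is not the code in this PR: the interfaces and the helper names `extractCodeCells` and `buildBidirectionalMapping` are hypothetical, and the only assumptions taken from the description are that code cells carry a UUID in `metadata.uuid` and that the result exposes `operator_to_cell` and `cell_to_operator`.

```typescript
// Hypothetical shapes for illustration; the types used in the PR may differ.
interface NotebookCell {
  cell_type: string;
  source: string | string[]; // nbformat allows a single string or a list of lines
  metadata: { uuid?: string };
}

interface Notebook {
  cells: NotebookCell[];
}

interface BidirectionalMapping {
  operator_to_cell: Record<string, string[]>;
  cell_to_operator: Record<string, string[]>;
}

// Keep only code cells that carry a UUID, normalizing `source` to one string.
function extractCodeCells(notebook: Notebook): { uuid: string; source: string }[] {
  const result: { uuid: string; source: string }[] = [];
  for (const cell of notebook.cells) {
    const uuid = cell.metadata.uuid;
    if (cell.cell_type !== "code" || uuid === undefined) {
      continue;
    }
    const source = Array.isArray(cell.source) ? cell.source.join("") : cell.source;
    result.push({ uuid, source });
  }
  return result;
}

// Invert an operator -> cells mapping so lookups work in both directions.
function buildBidirectionalMapping(operatorToCell: Record<string, string[]>): BidirectionalMapping {
  const cellToOperator: Record<string, string[]> = {};
  for (const operatorId of Object.keys(operatorToCell)) {
    for (const cellUuid of operatorToCell[operatorId] || []) {
      const operators = cellToOperator[cellUuid] || [];
      operators.push(operatorId);
      cellToOperator[cellUuid] = operators;
    }
  }
  return { operator_to_cell: operatorToCell, cell_to_operator: cellToOperator };
}
```

Deriving `cell_to_operator` by inverting `operator_to_cell` keeps the two directions consistent by construction, which is one plausible way to satisfy the "bidirectional" requirement regardless of which direction the LLM returns.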
**`migration-prompts.ts`** — string constants used by `migration-llm.ts`:
`TEXERA_OVERVIEW`, `TUPLE_DOCUMENTATION`, `TABLE_DOCUMENTATION`,
`OPERATOR_DOCUMENTATION`, `UDF_INPUT_PORT_DOCUMENTATION`,
`EXAMPLE_OF_GOOD_CONVERSION`, `VISUALIZER_DOCUMENTATION`,
`EXAMPLE_OF_MULTIPLE_UDF_CONVERSION`, `WORKFLOW_PROMPT`, `MAPPING_PROMPT`.
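
To make the role of these constants concrete, the sketch below shows one way documentation strings could be seeded as `system` messages before a task prompt is appended as a `user` message. This is an assumption-laden illustration, not the PR's implementation: `ChatMessage`, `seedHistory`, and `appendUserPrompt` are hypothetical names, the constant values are placeholders, and the `role`/`content` shape simply mirrors the common chat-completion message format.

```typescript
// Hypothetical message shape, modeled on the usual role/content chat format.
type ChatRole = "system" | "user" | "assistant";

interface ChatMessage {
  role: ChatRole;
  content: string;
}

// Placeholder values; the real text lives in migration-prompts.ts.
const TEXERA_OVERVIEW = "<Texera overview text>";
const TUPLE_DOCUMENTATION = "<Tuple documentation text>";
const WORKFLOW_PROMPT = "<workflow conversion instructions>";

// Turn each documentation string into a `system` message, preserving order.
function seedHistory(docs: string[]): ChatMessage[] {
  return docs.map((content): ChatMessage => ({ role: "system", content }));
}

// Return a new history with the prompt and its payload appended as one `user` message.
function appendUserPrompt(history: ChatMessage[], prompt: string, payload: string): ChatMessage[] {
  return [...history, { role: "user", content: `${prompt}\n\n${payload}` }];
}

const seeded = seedHistory([TEXERA_OVERVIEW, TUPLE_DOCUMENTATION]);
const withPrompt = appendUserPrompt(seeded, WORKFLOW_PROMPT, JSON.stringify({ cells: [] }));
```

Returning a new array from `appendUserPrompt` rather than mutating the history keeps the seeded documentation reusable across calls, although a session class may reasonably prefer to mutate its own history in place.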
### Any related issues, documentation, discussions?
Closes #5259
Parent issue #4301
### How was this PR tested?
No unit tests are included, for these reasons:
- A large portion of the change is prompt text, which can be reviewed by
reading but cannot be unit tested. The prompt text can still be revised later
to improve the LLM's performance.
- Testing would require mocking a significant amount of logic that will be
introduced in later PRs, since the logic in `migration-llm.ts` parses the
LLM's response.
However, I am open to writing tests based on review feedback.
### Was this PR authored or co-authored using generative AI tooling?
Generated-by: Claude Code (Claude Opus 4.7)