mistercrunch commented on PR #32649: URL: https://github.com/apache/superset/pull/32649#issuecomment-3050606252
Worked with GPT on what a foundational, backend-first extension could look like. Somehow feels like tightly coupling with `langgraph` (at least at this time) is a reasonable approach, and more future-proof than langchain from my understanding. Anyhow, I think there are some good ideas in here:

# 🧠 Superset Extension: `superset-extension-langgraph`

A modular LLM framework for Superset, built on **LangGraph**, exposing secure and configurable APIs that the Superset UI (or external agents) can call.

---

## ✅ 1. Choose LangGraph as the Orchestration Engine

- LangGraph gives us:
  - explicit state handling and branching logic
  - retries, failure handling, async support
  - composability (define base graphs, let users override parts)
- Still compatible with LangChain tools and models.
- Better suited for long-lived, stateful, production-ready agent workflows.

---

## 📦 2. Define `LLMConfiguration` Objects

Analogous to a DB connection — stored in the metadata DB, RBAC-controlled, API-exposed.

```python
from dataclasses import dataclass, field
from typing import Any, Optional
from uuid import UUID


@dataclass
class LLMConfiguration:
    id: UUID
    name: str
    provider: str  # "openai", "anthropic", "llama-cpp", etc.
    base_url: Optional[str]
    model: str
    temperature: float
    max_tokens: int
    headers: dict[str, str]
    default_prompt_context: str
    metadata: dict[str, Any]
    created_by: Optional[int] = None    # FK to user, as on other Superset models
    modified_by: Optional[int] = None   # FK to user
    permissions: list[str] = field(default_factory=list)  # perm ids
```

Can be encrypted/stored securely like database credentials.

---

## 🔐 3. Define RBAC Permissions

Use Superset's existing RBAC model to control access:

- `can_use_llm_config[llm_config_name]`
- `can_run_llm_graph[graph_name]`
- Optional: `can_use_ai_on_dashboard`, `can_run_sql_via_ai`, etc.

These permissions get attached to LLM config objects or graph identifiers. Admins decide which roles/users can use which LLM.

---

## ⚙️ 4. Support for Graph/Chain Configs

- Define default LangGraph graphs (`dashboard_summary_graph`, `chart_question_graph`, etc.)
- Let operators:
  - inject pre-prompts or override sections
  - disable/enable certain tools
  - alter text templates
- Graphs can be registered via:
  - Python config (`graph_registry.py`)
  - YAML files (`graph_config.yaml`)
  - CLI/API in future

---

## 📡 5. New REST API Endpoints

Expose endpoints to:

- list available graphs per user
- execute a specific graph with a selected LLM config
- return **typed, schema-validated responses**

### Example: POST `/api/v1/llm/execute`

```json
{
  "llm_config_id": "123",
  "graph": "dashboard_summary",
  "input": {
    "dashboard_id": 22,
    "question": "What’s happening with international orders in Canada?"
  }
}
```

### Response

```json
{
  "type": "dashboard_summary_response",
  "summary": "...",
  "related_charts": [123, 456],
  "sql_suggestions": [...]
}
```

Use `pydantic` + `JSONSchema` for validation and frontend integration. (A rough sketch of how the graph registry and this execute path could fit together follows section 7 below.)

---

## 🧩 6. Frontend Integration

- Add a generic `callLLM(configId, graphName, inputPayload)` helper
- Register UI hooks per graph (e.g., a “Summarize Dashboard” button)
- Build a `<LLMPanel>` React component that:
  - takes a graph ID
  - handles input/output schema
  - renders the result, handles loading/errors

---

## 🔍 7. Observability & Debugging

Track and optionally store:

- user ID, graph name, config used
- input + output (opt-in redaction)
- token usage, latency, retries
- success/failure state

Store in an `llm_graph_invocations` table or stream to logs/tracing tools (e.g., LangSmith, OpenTelemetry).
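To make 4 and 5 a bit more concrete, here is a rough sketch of what one registered graph plus the execute path behind `/api/v1/llm/execute` could look like. Every name here (the node functions, `GRAPH_REGISTRY`, `execute_graph`) is hypothetical; nothing is wired to Superset internals or a real model, the nodes just stub out what a real graph would do.

```python
from typing import TypedDict

from langgraph.graph import StateGraph, END
from pydantic import BaseModel


class DashboardSummaryState(TypedDict, total=False):
    dashboard_id: int
    question: str
    summary: str
    related_charts: list[int]


class DashboardSummaryResponse(BaseModel):
    """Typed response returned by /api/v1/llm/execute for this graph."""
    type: str = "dashboard_summary_response"
    summary: str
    related_charts: list[int] = []


def fetch_dashboard_context(state: DashboardSummaryState) -> DashboardSummaryState:
    # Placeholder: would pull chart/dataset metadata for state["dashboard_id"]
    # from Superset's metadata DB.
    return {"related_charts": [123, 456]}


def summarize(state: DashboardSummaryState) -> DashboardSummaryState:
    # Placeholder: would call the model bound to the selected LLMConfiguration.
    return {"summary": f"Stub summary for dashboard {state['dashboard_id']}"}


def build_dashboard_summary_graph():
    graph = StateGraph(DashboardSummaryState)
    graph.add_node("fetch_context", fetch_dashboard_context)
    graph.add_node("summarize", summarize)
    graph.set_entry_point("fetch_context")
    graph.add_edge("fetch_context", "summarize")
    graph.add_edge("summarize", END)
    return graph.compile()


# What graph_registry.py could expose; a real version would cache compiled graphs.
GRAPH_REGISTRY = {"dashboard_summary": build_dashboard_summary_graph}


def execute_graph(graph_name: str, payload: dict) -> dict:
    """Roughly what the REST handler would call after RBAC checks."""
    compiled = GRAPH_REGISTRY[graph_name]()
    final_state = compiled.invoke(payload)
    return DashboardSummaryResponse(
        summary=final_state["summary"],
        related_charts=final_state.get("related_charts", []),
    ).model_dump()


if __name__ == "__main__":
    print(execute_graph("dashboard_summary", {"dashboard_id": 22, "question": "..."}))
```

The nice part of an `execute_graph`-shaped seam is that the REST layer and the local CLI test command can both call into it, so graph logic stays testable without going through HTTP.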
---

## 🔬 8. Local Dev & Testing

- CLI tool: `superset llm test-graph dashboard_summary --input '{...}'` (a rough Click sketch is at the end of this comment)
- Include mock data, unit test scaffolds
- Allow extension devs to quickly test prompt variations + payload shape

---

## 💡 Summary

This architecture:

- Keeps Superset modular, secure, and flexible
- Supports multiple backends and AI providers
- Scales to many AI-powered features across the app
- Aligns with the long-term vision of Superset as a platform (like VSCode)
- Complements Superset’s **MCP** strategy, treating LLMs as clients rather than dependencies

Want a working scaffold with stubs for the backend config, graph loader, and REST layer? Let me know.
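For the section 8 CLI, here is roughly what `test-graph` could look like as a Click sub-command. Superset's CLI is already Click-based, so this would slot in under a `superset llm` group; the registry import path is made up for illustration.

```python
import json

import click


@click.group(name="llm")
def llm() -> None:
    """Hypothetical `superset llm ...` command group."""


@llm.command(name="test-graph")
@click.argument("graph_name")
@click.option("--input", "input_json", default="{}", help="JSON payload passed to the graph")
def test_graph(graph_name: str, input_json: str) -> None:
    """Run a registered graph locally against mock/sample input and print the result."""
    # Hypothetical import: wherever the extension keeps its graph registry.
    from superset_extension_langgraph.registry import GRAPH_REGISTRY

    compiled = GRAPH_REGISTRY[graph_name]()
    result = compiled.invoke(json.loads(input_json))
    click.echo(json.dumps(result, indent=2, default=str))


if __name__ == "__main__":
    llm()
```

Invocation would then match the example above, e.g. `superset llm test-graph dashboard_summary --input '{"dashboard_id": 22}'`.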