LRriver opened a new issue, #348:
URL: https://github.com/apache/hugegraph-ai/issues/348

   ### Search before asking
   
   - [x] I have searched the [feature](https://github.com/apache/hugegraph-ai/issues?q=is%3Aissue+label%3A%22Feature%22) issues and found no similar feature request.
   
   
   ### Feature Description
   
   HugeGraph-LLM has a graph extraction flow and the Gradio demo can call it,
   but there is no public REST endpoint for graph extraction. Please add a
   FastAPI endpoint that exposes graph extraction through the existing
   scheduler/flow boundary.
   
   This gives users a programmatic way to extract vertices and edges without
   using the demo UI.
   
   ## Current verification
   
   - The API router currently exposes `/rag`, `/rag/graph`, `/config/*`, and
     `/text2gremlin` in `hugegraph-llm/src/hugegraph_llm/api/rag_api.py`.
   - The FastAPI app registers `rag_http_api(...)` and `admin_http_api(...)`,
     but no graph extraction API is registered in
     `hugegraph-llm/src/hugegraph_llm/demo/rag_demo/app.py`.
   - The graph extraction flow already exists as `FlowName.GRAPH_EXTRACT` and
     is registered in `Scheduler` in
     `hugegraph-llm/src/hugegraph_llm/flows/scheduler.py`.
   - The demo helper calls the flow through `SchedulerSingleton`, so the
     endpoint should reuse that path rather than directly instantiating
     low-level operators.
   
   ## Suggested endpoint
   
   `POST /graph/extract`
   
   Suggested request fields:
   
   - `texts`: string or list of strings.
   - `schema`: graph schema JSON string or object, matching the existing flow
     expectations.
   - `example_prompt`: optional graph extraction prompt header.
   - `extract_type`: default `property_graph`.
   - `language`: default `zh`.
   - `split_type`: default `document`; valid values should match `ChunkSplit`.
   - `include_meta`: optional flag for chunk count, call count, and warnings.
   
   Suggested response fields:
   
   - `vertices`
   - `edges`
   - `warning`, when extraction returns no graph data or partial errors occur.
   - `meta`, when requested.
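
The request fields and defaults listed above could be captured roughly as follows. This is a framework-neutral sketch using a standard-library dataclass, not the project's actual model: the class name `GraphExtractRequest` and the `ASSUMED_SPLIT_TYPES` values are placeholders, and the real list of split types should come from whatever `ChunkSplit` accepts.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional, Union

# Assumption: placeholder values only; replace with what ChunkSplit accepts.
ASSUMED_SPLIT_TYPES = ("document", "paragraph", "sentence")


@dataclass
class GraphExtractRequest:
    """Hypothetical request shape for POST /graph/extract."""

    texts: Union[str, List[str]]
    schema: Union[str, Dict[str, Any]]
    example_prompt: Optional[str] = None
    extract_type: str = "property_graph"
    language: str = "zh"
    split_type: str = "document"
    include_meta: bool = False

    def __post_init__(self) -> None:
        # Accept a single string or a list, and always store a list.
        if isinstance(self.texts, str):
            self.texts = [self.texts]
        if (
            not isinstance(self.texts, list)
            or not self.texts
            or any(not isinstance(t, str) or not t.strip() for t in self.texts)
        ):
            raise ValueError("texts must contain at least one non-empty string")
        if self.split_type not in ASSUMED_SPLIT_TYPES:
            raise ValueError(f"unsupported split_type: {self.split_type!r}")


req = GraphExtractRequest(texts="Alice knows Bob.", schema="{}")
print(req.texts, req.split_type)
```

If this is ported to a Pydantic model, note that a field named `schema` may clash with an attribute that Pydantic's `BaseModel` already defines, depending on the Pydantic version, so an alias might be needed.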
   
   ## Mermaid reference
   
   ```mermaid
   sequenceDiagram
       participant Client
       participant API as FastAPI /graph/extract
       participant Scheduler as SchedulerSingleton
       participant Flow as GraphExtractFlow
       participant Nodes as Schema + ChunkSplit + Extract nodes
   
       Client->>API: POST texts, schema, prompt, split_type
       API->>API: validate request
       API->>Scheduler: schedule_flow(FlowName.GRAPH_EXTRACT, ...)
       Scheduler->>Flow: prepare/build pipeline
       Flow->>Nodes: run graph extraction
       Nodes-->>Flow: vertices, edges, metadata
       Flow-->>Scheduler: post_deal result
       Scheduler-->>API: extraction result
       API-->>Client: JSON response
   ```
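
The sequence in the diagram could be sketched in plain Python as below. Everything here is illustrative: `handle_graph_extract` is a hypothetical name, `GRAPH_EXTRACT` is a stand-in for `FlowName.GRAPH_EXTRACT`, and the positional argument order passed to `schedule_flow` is an assumption, since the issue elides the real argument list. The scheduler is injected as a callable so the sketch runs without HugeGraph-LLM installed.

```python
import json
from typing import Any, Callable, Dict, List

# Stand-in for FlowName.GRAPH_EXTRACT; the value is assumed, illustration only.
GRAPH_EXTRACT = "graph_extract"


def handle_graph_extract(
    schedule_flow: Callable[..., Any],
    texts: List[str],
    schema: str,
    example_prompt: str,
    split_type: str = "document",
) -> Dict[str, Any]:
    """Mirror the diagram: validate, schedule the flow, return JSON-ready data."""
    # API->>API: validate request
    if not texts or not all(t.strip() for t in texts):
        raise ValueError("texts must not be empty")
    # API->>Scheduler: the argument order below is an assumption.
    raw = schedule_flow(GRAPH_EXTRACT, schema, texts, example_prompt, split_type)
    # Scheduler-->>API: decode in case the flow result is a JSON-encoded string.
    result = json.loads(raw) if isinstance(raw, str) else raw
    # API-->>Client: always expose vertices and edges as lists.
    return {
        "vertices": result.get("vertices", []),
        "edges": result.get("edges", []),
    }


def fake_schedule_flow(*args: Any) -> str:
    """Fake scheduler that returns a JSON-encoded string."""
    return json.dumps({"vertices": [{"label": "person"}], "edges": []})


print(handle_graph_extract(fake_schedule_flow, ["Alice knows Bob."], "{}", ""))
```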
   
   ## Acceptance criteria
   
   - `POST /graph/extract` is available from the FastAPI app.
   - The endpoint uses `SchedulerSingleton` and `FlowName.GRAPH_EXTRACT`.
   - Request validation rejects empty text and invalid schema with 4xx errors.
   - If `schema` is accepted as an object, the API layer normalizes it to the
     JSON string shape expected by the current `SchemaNode`.
   - The response returns structured JSON, not a JSON-encoded string inside a
     string.
   - Existing demo graph extraction behavior remains unchanged.
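
The schema normalization criterion above might look like the sketch below. `normalize_schema` and `BadRequest` are hypothetical names; `BadRequest` stands for whatever error the API layer maps to a 4xx response. The sketch assumes only JSON object schemas are valid input. If the existing flow also accepts other string forms, the string branch would need to pass those through instead of rejecting them.

```python
import json
from typing import Any, Dict, Union


class BadRequest(ValueError):
    """Hypothetical error that the API layer would map to an HTTP 4xx response."""


def normalize_schema(schema: Union[str, Dict[str, Any], None]) -> str:
    """Return the schema as a JSON string, or raise BadRequest."""
    if isinstance(schema, dict):
        if not schema:
            raise BadRequest("schema object must not be empty")
        # Object input: serialize to the JSON string shape.
        return json.dumps(schema, ensure_ascii=False)
    if isinstance(schema, str) and schema.strip():
        try:
            parsed = json.loads(schema)
        except json.JSONDecodeError as exc:
            raise BadRequest(f"schema is not valid JSON: {exc.msg}") from exc
        if not isinstance(parsed, dict):
            raise BadRequest("schema JSON must be an object")
        # String input that already parses as an object is passed through.
        return schema
    raise BadRequest("schema is required")


as_string = normalize_schema({"vertexlabels": [], "edgelabels": []})
print(type(as_string).__name__, json.loads(as_string))
```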
   
   ## Suggested tests
   
   - Pydantic request model tests for valid and invalid inputs.
   - API test with a mocked scheduler/flow result.
   - Contract test that the endpoint returns `vertices` and `edges` as arrays.
   - Regression test that the existing `/rag` and `/text2gremlin` endpoints are
     still registered.
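
As one possible starting point for the contract test, the helper below checks a decoded response body. The name `contract_problems` is hypothetical, and the checks cover only what this issue states: the body is structured JSON rather than a JSON-encoded string, and `vertices` and `edges` are arrays.

```python
import json
from typing import Any, List


def contract_problems(payload: Any) -> List[str]:
    """Return contract violations; an empty list means the payload passes."""
    if isinstance(payload, str):
        return ["payload is a JSON-encoded string, expected an object"]
    if not isinstance(payload, dict):
        return ["payload is not a JSON object"]
    problems = []
    for key in ("vertices", "edges"):
        if not isinstance(payload.get(key), list):
            problems.append(f"{key} must be an array")
    return problems


good = {"vertices": [{"label": "person"}], "edges": []}
double_encoded = json.dumps(good)  # the shape the endpoint should not return

print(contract_problems(good))
print(contract_problems(double_encoded))
```

In an API test, the same helper could be applied to the parsed body returned by the endpoint while the scheduler is mocked.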
   
   ## Dependencies
   
   - This can be implemented independently, but it should share the
     `split_type` contract from `01-configurable-graph-extract-chunk-split.md`
     if that task has landed.
   
   
   ### Are you willing to submit a PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [x] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
