Kryst4lDem0ni4s opened a new issue, #185:
URL: https://github.com/apache/incubator-hugegraph-ai/issues/185

   ### Search before asking
   
   - [x] I had searched in the [feature](https://github.com/apache/incubator-hugegraph-ai/issues?q=is%3Aissue+label%3A%22Feature%22) and found no similar feature requirement.
   
   
   ### Feature Description (功能描述)
   
   Issue: 
   File Path: .\incubator-hugegraph-ai\hugegraph-llm\src\hugegraph_llm\resources\demo\config_prompt.yaml
 
   I have tested the current prompt against the following Large Language Models:
   
   > API:
   >   - GPT-4o mini
   >   - o3-mini
   > Local/Ollama:
   >   - tom_himanen/deepseek-r1-roo-cline-tools:1.5b
   >   - tom_himanen/deepseek-r1-roo-cline-tools:7b
   >   - deepseek-r1:7b
   >   - qwen2.5-coder:1.5b-base
   >   - deepseek-coder-v2:16b
   >   - deepseek-r1:14b
   
   The test results were very poor.
   The prompt does not clearly define the output format requirements described in the Apache TinkerPop Gremlin documentation, and it can be improved through further testing and prompt engineering.
   
   Example of the output generated with the current prompt file:
   ```
    "vertices": [
       {
         "id": "1:person",
         "label": "person",
         "type": "vertex",
         "properties": {
           "name": "Sarah",
           "age": "30",
           "occupation": "attorney"
         }
       },
       {
         "id": "1:webpage",
         "label": "webpage",
         "type": "vertex",
         "properties": {
           "name": "www.sarahsplace.com",
           "url": "None"
         }
       }
     ],
     "edges": [
       {
         "label": "roommate",
         "type": "edge",
         "outV": "1:person",
         "outVLabel": "person",
         "inV": "1:webpage",
         "inVLabel": "webpage",
         "properties": {
           "date": "2010"
         }
       }
     ]
   }
   ```
   
   Why are these results incorrect?
   Across numerous tests, the outputs showed the following errors: missing keys such as "vertices", "edges", "edgelabels", "vertexlabels", and "propertykeys"; missing IDs; incorrect ID sequencing; missing "source_label" and "target_label"; and other syntax errors.
   
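The error categories above can be checked mechanically before a generated graph is accepted. The following is a minimal sketch, not an official HugeGraph validator: the key names are taken from the expected example in this issue, and `find_problems` is a hypothetical helper that does not exist in hugegraph-llm.

```python
from typing import Any

# Key names copied from the expected example in this issue (an assumption,
# not an official HugeGraph schema definition).
TOP_LEVEL_KEYS = ("vertices", "edges", "schema")
SCHEMA_KEYS = ("vertexlabels", "edgelabels", "propertykeys")
EDGE_KEYS = ("id", "label", "outV", "inV")


def find_problems(graph: dict[str, Any]) -> list[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems = [f"missing top-level key: {k}" for k in TOP_LEVEL_KEYS if k not in graph]

    schema = graph.get("schema", {})
    if "schema" in graph:
        problems += [f"missing schema key: {k}" for k in SCHEMA_KEYS if k not in schema]

    # Collect vertex IDs so edge endpoints can be checked against them.
    vertex_ids = set()
    for i, vertex in enumerate(graph.get("vertices", [])):
        if "id" not in vertex:
            problems.append(f"vertex {i}: missing id")
        else:
            vertex_ids.add(vertex["id"])

    for i, edge in enumerate(graph.get("edges", [])):
        for key in EDGE_KEYS:
            if key not in edge:
                problems.append(f"edge {i}: missing {key}")
        for end in ("outV", "inV"):
            if end in edge and edge[end] not in vertex_ids:
                problems.append(f"edge {i}: {end} refers to unknown vertex {edge[end]}")

    for i, label in enumerate(schema.get("edgelabels", [])):
        for key in ("source_label", "target_label"):
            if key not in label:
                problems.append(f"edgelabel {i}: missing {key}")
    return problems


# A trimmed version of the generated output shown earlier in this issue.
generated = {
    "vertices": [{"id": "1:person", "label": "person"},
                 {"id": "1:webpage", "label": "webpage"}],
    "edges": [{"label": "roommate", "outV": "1:person", "inV": "1:webpage"}],
}
for problem in find_problems(generated):
    print(problem)
```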
   Expected syntax example for reference:
   
   ```
   {
     "vertices": [
       {
         "id": "2:lop",
         "label": "software",
         "type": "vertex",
         "properties": {
           "name": "lop",
           "lang": "java",
           "price": 328
         }
       },
       {
         "id": "1:josh",
         "label": "person",
         "type": "vertex",
         "properties": {
           "name": "josh",
           "age": 32,
           "city": "Beijing"
         }
       },
       {
         "id": "1:marko",
         "label": "person",
         "type": "vertex",
         "properties": {
           "name": "marko",
           "age": 29,
           "city": "Beijing"
         }
       },
       {
         "id": "1:peter",
         "label": "person",
         "type": "vertex",
         "properties": {
           "name": "peter",
           "age": 35,
           "city": "Shanghai"
         }
       },
       {
         "id": "1:vadas",
         "label": "person",
         "type": "vertex",
         "properties": {
           "name": "vadas",
           "age": 27,
           "city": "Hongkong"
         }
       },
       {
         "id": "2:ripple",
         "label": "software",
         "type": "vertex",
         "properties": {
           "name": "ripple",
           "lang": "java",
           "price": 199
         }
       }
     ],
     "edges": [
       {
         "id": "S1:josh>2>2>>S2:lop",
         "label": "created",
         "type": "edge",
         "outV": "1:josh",
         "outVLabel": "person",
         "inV": "2:lop",
         "inVLabel": "software",
         "properties": {
           "weight": 0.4,
           "date": "20091111"
         }
       },
       {
         "id": "S1:josh>2>2>>S2:ripple",
         "label": "created",
         "type": "edge",
         "outV": "1:josh",
         "outVLabel": "person",
         "inV": "2:ripple",
         "inVLabel": "software",
         "properties": {
           "weight": 1,
           "date": "20171210"
         }
       },
       {
         "id": "S1:marko>1>1>>S1:josh",
         "label": "knows",
         "type": "edge",
         "outV": "1:marko",
         "outVLabel": "person",
         "inV": "1:josh",
         "inVLabel": "person",
         "properties": {
           "weight": 1,
           "date": "20130220"
         }
       },
       {
         "id": "S1:marko>1>1>>S1:vadas",
         "label": "knows",
         "type": "edge",
         "outV": "1:marko",
         "outVLabel": "person",
         "inV": "1:vadas",
         "inVLabel": "person",
         "properties": {
           "weight": 0.5,
           "date": "20160110"
         }
       },
       {
         "id": "S1:marko>2>2>>S2:lop",
         "label": "created",
         "type": "edge",
         "outV": "1:marko",
         "outVLabel": "person",
         "inV": "2:lop",
         "inVLabel": "software",
         "properties": {
           "weight": 0.4,
           "date": "20171210"
         }
       },
       {
         "id": "S1:peter>2>2>>S2:lop",
         "label": "created",
         "type": "edge",
         "outV": "1:peter",
         "outVLabel": "person",
         "inV": "2:lop",
         "inVLabel": "software",
         "properties": {
           "weight": 0.2,
           "date": "20170324"
         }
       }
     ],
     "schema": {
       "vertexlabels": [
         {
           "id": 1,
           "name": "person",
           "id_strategy": "PRIMARY_KEY",
           "primary_keys": [
             "name"
           ],
           "properties": [
             "name",
             "age",
             "occupation"
           ],
           "nullable_keys": [
             "age",
             "occupation"
           ]
         },
         {
           "id": 2,
           "name": "webpage",
           "id_strategy": "PRIMARY_KEY",
           "primary_keys": [
             "name"
           ],
           "properties": [
             "name",
             "url"
           ],
           "nullable_keys": [
             "url"
           ]
         }
       ],
       "edgelabels": [
         {
           "id": 1,
           "name": "roommate",
           "source_label": "person",
           "target_label": "person",
           "properties": [
             "date"
           ]
         },
         {
           "id": 2,
           "name": "link",
           "source_label": "webpage",
           "target_label": "person",
           "properties": []
         }
       ],
       "propertykeys": [
         {
           "name": "name",
           "data_type": "TEXT",
           "cardinality": "SINGLE"
         },
         {
           "name": "age",
           "data_type": "TEXT",
           "cardinality": "SINGLE"
         },
         {
           "name": "occupation",
           "data_type": "TEXT",
           "cardinality": "SINGLE"
         },
         {
           "name": "url",
           "data_type": "TEXT",
           "cardinality": "SINGLE"
         },
         {
           "name": "date",
           "data_type": "TEXT",
           "cardinality": "SINGLE"
         }
       ]
     }
   }
   ```
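The vertex IDs in the example above appear to follow a `<label id>:<primary key value>` pattern, for example `1:marko` for a `person` whose label id is 1. The sketch below illustrates that reading only; it assumes exactly one primary key per label, and `vertex_id` is a hypothetical helper, not part of hugegraph-llm.

```python
def vertex_id(label: dict, properties: dict) -> str:
    """Build an ID such as '1:marko' from a vertex label entry and a vertex's properties.

    Assumes the PRIMARY_KEY pattern seen in the example, with exactly one primary key.
    """
    (key,) = label["primary_keys"]  # raises ValueError unless there is exactly one key
    return f"{label['id']}:{properties[key]}"


person = {"id": 1, "name": "person", "id_strategy": "PRIMARY_KEY", "primary_keys": ["name"]}
print(vertex_id(person, {"name": "marko", "age": 29}))  # 1:marko
```

Under this reading, a generated ID such as `1:person` would be malformed because it carries the label name where the primary key value is expected.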
   
   Note:
   This process can be improved in two iterations:
   1. Improve the prompts.
   2. Use a two-step sequence (a multi-agent system) for the complete JSON generation:
      - First step: generate vertices.
      - Second step: generate edges.

      Why? Splitting the work reduces the load on any single agent, so each agent has less to generalize over, especially when handling a large context window.

   This all assumes that the schema and property keys are added automatically once the vertices and edges are generated correctly.
   
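The two-step sequence proposed in the note could look roughly like the sketch below. Everything in it is an assumption made for illustration: `LLM` stands for any function that sends a prompt to a model and returns its text reply, the prompt wording is not taken from `config_prompt.yaml`, and `fake_llm` is a canned stand-in used only to show the control flow.

```python
import json
from typing import Callable

# Placeholder type: any function that takes a prompt and returns the model's reply.
LLM = Callable[[str], str]


def extract_graph(text: str, llm: LLM) -> dict:
    """Two-step extraction: vertices first, then edges restricted to those vertices."""
    vertices = json.loads(llm(
        "Extract the vertices from the text below as a JSON list of objects "
        "with keys id, label, type, properties.\n\n" + text
    ))
    known_ids = [v["id"] for v in vertices]
    edges = json.loads(llm(
        "Extract the edges from the text below as a JSON list of objects with keys "
        "id, label, type, outV, outVLabel, inV, inVLabel, properties. "
        f"Use only these vertex IDs for outV and inV: {json.dumps(known_ids)}\n\n" + text
    ))
    # Drop edges whose endpoints were not produced in the first step.
    edges = [e for e in edges if e.get("outV") in known_ids and e.get("inV") in known_ids]
    return {"vertices": vertices, "edges": edges}


def fake_llm(prompt: str) -> str:
    """Canned replies standing in for a real model call."""
    if prompt.startswith("Extract the vertices"):
        return json.dumps([
            {"id": "1:marko", "label": "person", "type": "vertex", "properties": {"name": "marko"}},
            {"id": "1:josh", "label": "person", "type": "vertex", "properties": {"name": "josh"}},
        ])
    return json.dumps([
        {"label": "knows", "type": "edge", "outV": "1:marko", "inV": "1:josh", "properties": {}},
        {"label": "knows", "type": "edge", "outV": "1:marko", "inV": "1:peter", "properties": {}},
    ])


graph = extract_graph("marko knows josh and peter", fake_llm)
print(json.dumps(graph, indent=2))
```

Passing the vertex IDs from the first step into the second prompt, and filtering afterwards, keeps the edge step from inventing endpoints that the vertex step never produced.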
   
   
   
   ### Are you willing to submit a PR?
   
   - [x] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [x] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
