GitHub user dosubot[bot] edited a comment on the discussion: 关于电网拓扑数据导入 (About importing power grid topology data)
<!-- Answer -->
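Your import finished with no errors, yet some devices have no connections. The root cause is that the edges section of struct_1.json sets field_mapping to `"from": "id", "to": "id"` instead of `"from": "from", "to": "to"`. With that mapping the loader imports the id column of the edge CSV as an edge property, so edges either fail to attach to their vertices or are silently skipped, and some devices (for example BDZ-3002, XLD-3445, XLD-3380) end up without connections.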
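**Problem analysis:**
1. **Wrong field_mapping**: it must be `"from": "from", "to": "to"`; otherwise the loader cannot resolve the vertex primary key id, and every edge fails.
2. **The edge label's .properties() must declare every property that is actually used**: if the data carries source_label/target_label, list them, otherwise the loader reports an error or skips the rows.
3. **The CSV files contain duplicate rows**: vertex_linesegment-1.csv, vertex_stationhouse-1.csv, and others have several rows with the same id, which causes vertices to overwrite one another or edges to attach incorrectly.
4. **Some devices have no connections**: ids that differ between the edge data and the vertex data, duplicate ids, blank lines, or the wrong field_mapping all prevent edges from attaching to those devices.
**Suggested fixes:**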
### 1. schema_1.groovy (keep only the required properties; edge labels declare source_label/target_label)
```groovy
// Property keys
schema.propertyKey("id").asText().ifNotExist().create();
schema.propertyKey("name").asText().ifNotExist().create();
schema.propertyKey("objectType").asText().ifNotExist().create();
schema.propertyKey("objectHandle").asText().ifNotExist().create();
schema.propertyKey("deviceType").asText().ifNotExist().create();
schema.propertyKey("internalEndpointNo").asInt().ifNotExist().create();
schema.propertyKey("terminalNo").asInt().ifNotExist().create();
schema.propertyKey("usage").asText().ifNotExist().create();
schema.propertyKey("powerFlowDirection").asInt().ifNotExist().create();
schema.propertyKey("belongSubstation").asText().ifNotExist().create();
schema.propertyKey("belongFeeder").asText().ifNotExist().create();
schema.propertyKey("source_label").asText().ifNotExist().create();
schema.propertyKey("target_label").asText().ifNotExist().create();
// Vertex labels
schema.vertexLabel("Substation").properties("id", "name", "objectType",
"objectHandle", "deviceType", "internalEndpointNo", "terminalNo", "usage",
"powerFlowDirection", "belongSubstation",
"belongFeeder").primaryKeys("id").ifNotExist().create();
schema.vertexLabel("LineSegment").properties("id", "name", "objectType",
"objectHandle", "deviceType", "internalEndpointNo", "terminalNo", "usage",
"powerFlowDirection", "belongSubstation",
"belongFeeder").primaryKeys("id").ifNotExist().create();
schema.vertexLabel("LineSwitch").properties("id", "name", "objectType",
"objectHandle", "deviceType", "internalEndpointNo", "terminalNo", "usage",
"powerFlowDirection", "belongSubstation",
"belongFeeder").primaryKeys("id").ifNotExist().create();
schema.vertexLabel("StationHouse").properties("id", "name", "objectType",
"objectHandle", "deviceType", "internalEndpointNo", "terminalNo", "usage",
"powerFlowDirection", "belongSubstation",
"belongFeeder").primaryKeys("id").ifNotExist().create();
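// Edge labels (one per connected pair of vertex types and per direction, which keeps the schema extensible and allows connections in both directions)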
schema.edgeLabel("Substation2LineSegment").sourceLabel("Substation").targetLabel("LineSegment").properties("source_label",
"target_label").ifNotExist().create();
schema.edgeLabel("LineSegment2Substation").sourceLabel("LineSegment").targetLabel("Substation").properties("source_label",
"target_label").ifNotExist().create();
schema.edgeLabel("LineSegment2StationHouse").sourceLabel("LineSegment").targetLabel("StationHouse").properties("source_label",
"target_label").ifNotExist().create();
schema.edgeLabel("StationHouse2LineSegment").sourceLabel("StationHouse").targetLabel("LineSegment").properties("source_label",
"target_label").ifNotExist().create();
schema.edgeLabel("LineSegment2LineSegment").sourceLabel("LineSegment").targetLabel("LineSegment").properties("source_label",
"target_label").ifNotExist().create();
schema.edgeLabel("LineSegment2LineSwitch").sourceLabel("LineSegment").targetLabel("LineSwitch").properties("source_label",
"target_label").ifNotExist().create();
schema.edgeLabel("LineSwitch2LineSegment").sourceLabel("LineSwitch").targetLabel("LineSegment").properties("source_label",
"target_label").ifNotExist().create();
schema.edgeLabel("StationHouse2LineSwitch").sourceLabel("StationHouse").targetLabel("LineSwitch").properties("source_label",
"target_label").ifNotExist().create();
schema.edgeLabel("LineSwitch2StationHouse").sourceLabel("LineSwitch").targetLabel("StationHouse").properties("source_label",
"target_label").ifNotExist().create();
```
---
### 2. struct_1.json (field_mapping in the edges section must be from/to, and value_mapping must filter precisely)
```json
{
  "vertices": [
    {
      "label": "Substation",
      "input": {
        "type": "file",
        "path": "vertex_substation-1.csv",
        "format": "CSV",
        "header": ["id", "name", "objectType", "objectHandle", "deviceType",
                   "internalEndpointNo", "terminalNo", "usage", "powerFlowDirection",
                   "belongSubstation", "belongFeeder"],
        "charset": "UTF-8"
      },
      "null_values": ["NULL", "null"]
    },
    // The other vertex types follow the same pattern.
    // (Placeholder comments like this one are not valid in strict JSON; remove them in the real file.)
  ],
  "edges": [
    {
      "label": "Substation2LineSegment",
      "source": ["from"],
      "target": ["to"],
      "input": {
        "type": "file",
        "path": "power_edges_to.csv",
        "format": "CSV",
        "header": ["id", "from", "to", "label", "source_label", "target_label"]
      },
      "field_mapping": { "from": "from", "to": "to" },
      "value_mapping": {
        "label": { "CONNECTED_TO": "Substation2LineSegment" },
        "source_label": { "Substation": "Substation" },
        "target_label": { "LineSegment": "LineSegment" }
      }
    },
    {
      "label": "LineSegment2Substation",
      "source": ["from"],
      "target": ["to"],
      "input": {
        "type": "file",
        "path": "power_edges_to.csv",
        "format": "CSV",
        "header": ["id", "from", "to", "label", "source_label", "target_label"]
      },
      "field_mapping": { "from": "from", "to": "to" },
      "value_mapping": {
        "label": { "CONNECTED_TO": "LineSegment2Substation" },
        "source_label": { "LineSegment": "LineSegment" },
        "target_label": { "Substation": "Substation" }
      }
    },
    // The remaining edge types follow the same pattern: write one block per connection type,
    // each with a value_mapping that filters precisely.
    // ...
  ]
}
```
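The edge blocks in struct_1.json differ only in their labels, so they can be generated instead of written by hand. The sketch below is one way to do that in Python. It only reproduces the shape of the template above and makes no claim about what the loader accepts. `EDGE_PAIRS`, `EDGE_HEADER`, `build_edge_mapping`, and `build_all_edge_mappings` are hypothetical names introduced for illustration; the label pairs are copied from the edge labels in schema_1.groovy. Running the script prints a JSON array that can be pasted into the `"edges"` section.

```python
import json

# (source vertex label, target vertex label) pairs, copied from the
# edge labels declared in schema_1.groovy above.
EDGE_PAIRS = [
    ("Substation", "LineSegment"),
    ("LineSegment", "Substation"),
    ("LineSegment", "StationHouse"),
    ("StationHouse", "LineSegment"),
    ("LineSegment", "LineSegment"),
    ("LineSegment", "LineSwitch"),
    ("LineSwitch", "LineSegment"),
    ("StationHouse", "LineSwitch"),
    ("LineSwitch", "StationHouse"),
]

# Column order of power_edges_to.csv, as given in the template.
EDGE_HEADER = ["id", "from", "to", "label", "source_label", "target_label"]


def build_edge_mapping(source_label, target_label, path="power_edges_to.csv"):
    """Return one "edges" entry shaped like the template above."""
    edge_label = f"{source_label}2{target_label}"
    return {
        "label": edge_label,
        "source": ["from"],
        "target": ["to"],
        "input": {
            "type": "file",
            "path": path,
            "format": "CSV",
            "header": list(EDGE_HEADER),
        },
        "field_mapping": {"from": "from", "to": "to"},
        "value_mapping": {
            "label": {"CONNECTED_TO": edge_label},
            "source_label": {source_label: source_label},
            "target_label": {target_label: target_label},
        },
    }


def build_all_edge_mappings():
    """Build one entry per (source, target) pair in EDGE_PAIRS."""
    return [build_edge_mapping(src, dst) for src, dst in EDGE_PAIRS]


if __name__ == "__main__":
    print(json.dumps(build_all_edge_mappings(), indent=2, ensure_ascii=False))
```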
---
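### 3. Key checks and fixes
- **All vertex_xxx.csv files and power_edges_to.csv must contain no duplicate rows, no blank lines, and no stray whitespace; every id must be unique and match the edge data exactly.**
- **Every field_mapping in the edges section of struct_1.json must be { "from": "from", "to": "to" }; it must not reference id.**
- **The value_mapping of each edgeLabel must filter label, source_label, and target_label precisely, and must match the sourceLabel/targetLabel in schema.groovy exactly.**
- **If some devices have no connections, check these first:**
  - The device id appears exactly once, without whitespace, across the relevant vertex_xxx.csv files.
  - The device id appears in the from/to columns of power_edges_to.csv, with the correct source_label/target_label.
  - struct_1.json contains the corresponding edgeLabel configuration, with a correct value_mapping.
- **If there are duplicate vertices or edges, deduplicate all CSV files first.**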
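The checks in this list can be automated before running the loader. The following is a minimal sketch, assuming the CSV files have no header row (the header comes from struct_1.json) and that the id, from, and to columns are named as in the templates above. All function names are hypothetical, and the sample names in the usage lines are made up for illustration.

```python
import csv
from collections import Counter


def parse_rows(lines, header):
    """Parse CSV lines into dicts keyed by header, trimming whitespace
    and skipping blank lines. `lines` can be an open file or a list of strings."""
    rows = []
    for raw in csv.reader(lines):
        if not any(cell.strip() for cell in raw):
            continue  # blank or whitespace-only line
        rows.append({name: cell.strip() for name, cell in zip(header, raw)})
    return rows


def duplicate_ids(vertex_rows):
    """Return the ids that occur more than once in the vertex rows."""
    counts = Counter(row.get("id") for row in vertex_rows)
    return sorted(vid for vid, n in counts.items() if n > 1)


def dangling_edges(edge_rows, vertex_ids):
    """Return edge rows whose from or to id is not a known vertex id."""
    return [
        row for row in edge_rows
        if row.get("from") not in vertex_ids or row.get("to") not in vertex_ids
    ]


def isolated_vertices(edge_rows, vertex_ids):
    """Return vertex ids that never appear in a from or to column."""
    touched = {row.get("from") for row in edge_rows} | {row.get("to") for row in edge_rows}
    return sorted(set(vertex_ids) - touched)


if __name__ == "__main__":
    # Illustrative in-memory data; with real files, pass
    # open(path, newline="", encoding="utf-8-sig") instead of a list.
    vertices = parse_rows(["BDZ-3002,Sub A\n", " XLD-3445 ,Line 1\n"], ["id", "name"])
    edges = parse_rows(["e1,BDZ-3002,XLD-3380\n"], ["id", "from", "to"])
    ids = {v["id"] for v in vertices}
    print("duplicates:", duplicate_ids(vertices))
    print("dangling:", dangling_edges(edges, ids))
    print("isolated:", isolated_vertices(edges, ids))
```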
---
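### 4. Why the edge count looks abnormal
- Your power_edges_to.csv lists every edge in both directions (A→B and B→A), and struct.json configures a mapping for every edgeLabel, so the number of edges ends up much larger than the number of vertices. As long as the data has no duplicates and the configuration is correct, this is not a problem.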
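To see how much of the edge count comes from storing both directions, the distinct directed pairs can be compared with the distinct undirected pairs. This is a small sketch, assuming edge rows are dicts with from and to keys; `count_edge_pairs` is a hypothetical helper, not part of any loader tooling. If the first number is about twice the second, the extra edges are simply the reverse directions.

```python
def count_edge_pairs(edge_rows):
    """Return (directed, undirected): the number of distinct (from, to)
    pairs, and the number of distinct pairs when direction is ignored."""
    directed = {(row["from"], row["to"]) for row in edge_rows}
    undirected = {frozenset(pair) for pair in directed}
    return len(directed), len(undirected)


if __name__ == "__main__":
    sample = [
        {"from": "A", "to": "B"},
        {"from": "B", "to": "A"},  # reverse of the first edge
        {"from": "B", "to": "C"},
    ]
    print(count_edge_pairs(sample))
```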
---
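### 5. Root cause of devices without connections
- The most likely cause is that the field_mapping in struct.json is written as { "from": "id", "to": "id" }, or that the value_mapping is imprecise, so edges are skipped or cannot attach to their vertices.
- Other possible causes are duplicate rows, blank lines, inconsistent primary keys, or hidden characters in the vertex CSV files.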
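Hidden characters are hard to spot by eye, so a short script can flag them. The sketch below reports whitespace, control, and format characters (for example a byte order mark or a zero-width space) inside an id. `suspicious_chars` and `clean_id` are hypothetical helpers, and `clean_id` assumes that valid ids never contain whitespace.

```python
import unicodedata


def _is_hidden(ch):
    """True for whitespace, control (Cc), and format (Cf) characters."""
    return ch.isspace() or unicodedata.category(ch) in ("Cc", "Cf")


def suspicious_chars(value):
    """Return the hidden characters found in value, in order of appearance."""
    return [ch for ch in value if _is_hidden(ch)]


def clean_id(value):
    """Return value with all hidden characters removed.
    Assumption: a valid id never contains whitespace."""
    return "".join(ch for ch in value if not _is_hidden(ch))


if __name__ == "__main__":
    raw = "\ufeffBDZ-3002 "  # BOM at the start, trailing space
    print([hex(ord(ch)) for ch in suspicious_chars(raw)])
    print(repr(clean_id(raw)))
```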
---
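For complete, corrected versions of struct_1.json and schema_1.groovy, start from the templates above and fill in every edgeLabel and value_mapping according to your actual data types and connections. With the configuration set up strictly as described, all device connections should import without errors.
If a specific device still has no connections, please post its raw rows from every relevant CSV together with the corresponding edgeLabel configuration from struct.json, and I can help pinpoint the problem row by row.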
GitHub link:
https://github.com/apache/hugegraph/discussions/2917#discussioncomment-15226200