liuxiaocs7 opened a new issue, #502: URL: https://github.com/apache/incubator-hugegraph-toolchain/issues/502
### Bug Type

others (please comment below)

### Before submit

- [X] I have searched the [issues](https://github.com/apache/hugegraph-toolchain/issues) and found no similar issues.

### Environment

- Server Version: v1.0.0
- Toolchain Version: master
- Data Size: xx vertices, xx edges <!-- (like 10 million vertices, 90 million edges) -->

### Expected & Actual behavior

The Spark Loader job finishes successfully, but the reported `insertSuccessCnt` is wrong: the summary prints `insertSuccessCnt: 0` even though all three input files were loaded.

```
23/08/04 00:11:49 INFO HugeGraphSparkLoader: Finished load example/spark/vertex_software.json data
23/08/04 00:11:49 INFO DAGScheduler: Job 5 finished: foreachPartition at HugeGraphSparkLoader.java:154, took 1.023988 s
23/08/04 00:11:49 INFO DAGScheduler: ResultStage 3 (foreachPartition at HugeGraphSparkLoader.java:154) finished in 1.021 s
23/08/04 00:11:49 INFO HugeGraphSparkLoader: Finished load example/spark/edge_knows.json data
23/08/04 00:11:49 INFO DAGScheduler: Job 4 is finished. Cancelling potential speculative or zombie tasks for this job
23/08/04 00:11:49 INFO TaskSchedulerImpl: Killing all running tasks in stage 3: Stage finished
23/08/04 00:11:49 INFO DAGScheduler: Job 4 finished: foreachPartition at HugeGraphSparkLoader.java:154, took 1.025300 s
23/08/04 00:11:49 INFO HugeGraphSparkLoader: Finished load example/spark/vertex_person.json data
23/08/04 00:11:49 INFO HugeGraphSparkLoader: ------------The data load task is complete-------------------
 insertSuccessCnt: 0
 ---------------------------------------------
23/08/04 00:11:49 INFO SparkUI: Stopped Spark web UI at http://192.168.34.164:4040
23/08/04 00:11:49 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
23/08/04 00:11:49 INFO MemoryStore: MemoryStore cleared
23/08/04 00:11:49 INFO BlockManager: BlockManager stopped
23/08/04 00:11:49 INFO BlockManagerMaster: BlockManagerMaster stopped
23/08/04 00:11:49 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
23/08/04 00:11:49 INFO SparkContext: Successfully stopped SparkContext
23/08/04 00:11:49 INFO SparkContext: SparkContext already stopped.
23/08/04 00:11:49 INFO SparkContext: SparkContext already stopped.
23/08/04 00:12:49 INFO ShutdownHookManager: Shutdown hook called
23/08/04 00:12:49 INFO ShutdownHookManager: Deleting directory /tmp/spark-41fba547-151e-4a5a-8982-fa9df80978db
23/08/04 00:12:49 INFO ShutdownHookManager: Deleting directory /tmp/spark-8c9b8b45-a9cf-4876-a719-b7446f4e46cd
```

### Vertex/Edge example

_No response_

### Schema [VertexLabel, EdgeLabel, IndexLabel]

```javascript
// Define schema
schema.propertyKey("name").asText().ifNotExist().create();
schema.propertyKey("age").asInt().ifNotExist().create();
schema.propertyKey("city").asText().ifNotExist().create();
schema.propertyKey("weight").asDouble().ifNotExist().create();
schema.propertyKey("lang").asText().ifNotExist().create();
schema.propertyKey("date").asText().ifNotExist().create();
schema.propertyKey("price").asDouble().ifNotExist().create();

schema.vertexLabel("person")
      .properties("name", "age", "city")
      .useCustomizeStringId()
      .nullableKeys("age", "city")
      .ifNotExist()
      .create();

schema.vertexLabel("software")
      .properties("name", "lang", "price")
      .useCustomizeStringId()
      .ifNotExist()
      .create();

schema.edgeLabel("knows")
      .sourceLabel("person")
      .targetLabel("person")
      .properties("date", "weight")
      .ifNotExist()
      .create();

schema.edgeLabel("created")
      .sourceLabel("person")
      .targetLabel("software")
      .properties("date", "weight")
      .ifNotExist()
      .create();
```

```json
{
  "vertices": [
    {
      "label": "person",
      "input": {
        "type": "file",
        "path": "example/spark/vertex_person.json",
        "format": "JSON",
        "header": ["name", "age", "city"],
        "charset": "UTF-8",
        "skipped_line": {
          "regex": "(^#|^//).*"
        }
      },
      "id": "name",
      "null_values": ["NULL", "null", ""]
    },
    {
      "label": "software",
      "input": {
        "type": "file",
        "path": "example/spark/vertex_software.json",
        "format": "JSON",
        "header": ["id", "name", "lang", "price", "ISBN"],
        "charset": "GBK"
      },
      "id": "name",
      "ignored": ["ISBN"]
    }
  ],
  "edges": [
    {
      "label": "knows",
      "source": ["source_name"],
      "target": ["target_name"],
      "input": {
        "type": "file",
        "path": "example/spark/edge_knows.json",
        "format": "JSON",
        "date_format": "yyyyMMdd",
        "header": ["source_name", "target_name", "date", "weight"]
      },
      "field_mapping": {
        "source_name": "name",
        "target_name": "name"
      }
    }
  ]
}
```

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
