Hi, we are taking our first steps with Neo4j and have tried several ways of creating an initial database.

1) We used the Java API with an embedded database. Here
https://github.com/linked-swissbib/swissbib-metafacture-commands/blob/neo4j-tests/src/main/java/org/swissbib/linked/mf/writer/NeoIndexer.java#L76
a transaction is closed which surrounds 20,000 nodes with relationships to around 40,000 other nodes. We are surprised that the Transaction.close() method needs up to 30 seconds to write these nodes to disk.

2) Then I wanted to compare my results with the neo4j-import script provided with the Neo4j server. With this method I have difficulties with the format of the CSV files.

My small examples:

first node file:

    lsId:ID(localsignature),:LABEL
    "NEBIS/002527587",LOCALSIGNATURE
    "OCoLC/637556711",LOCALSIGNATURE

second node file:

    brId:ID(bibliographicresource),active,:LABEL
    146404300,true,BIBLIOGRAPHICRESOURCE

relationship file:

    :START_ID(bibliographicresource),:END_ID(localsignature),:TYPE
    146404300,"NEBIS/002527587",SIGNATUREOF
    146404300,"OCoLC/637556711",SIGNATUREOF

The command

    ./neo4j-import --into [path-to-db]/test.db/ --nodes files/br.csv --nodes files/br.csv --relationships:SIGNATUREOF files/signatureof.csv

throws the following exception:

    Done in 191ms
    Prepare node index
    Exception in thread "Thread-3" org.neo4j.unsafe.impl.batchimport.cache.idmapping.string.DuplicateInputIdException: Id '146404300' is defined more than once in bibliographicresource, at least at /home/swissbib/environment/tools/neo4j-community-2.3.2/bin/files/br.csv:2 and /home/swissbib/environment/tools/neo4j-community-2.3.2/bin/files/br.csv:2
        at org.neo4j.unsafe.impl.batchimport.input.BadCollector$2.exception(BadCollector.java:107)
        at org.neo4j.unsafe.impl.batchimport.input.BadCollector.checkTolerance(BadCollector.java:176)
        at org.neo4j.unsafe.impl.batchimport.input.BadCollector.collectDuplicateNode(BadCollector.java:96)
        at org.neo4j.unsafe.impl.batchimport.cache.idmapping.string.EncodingIdMapper.detectDuplicateInputIds(EncodingIdMapper.java:590)
        at org.neo4j.unsafe.impl.batchimport.cache.idmapping.string.EncodingIdMapper.buildCollisionInfo(EncodingIdMapper.java:494)
        at org.neo4j.unsafe.impl.batchimport.cache.idmapping.string.EncodingIdMapper.prepare(EncodingIdMapper.java:282)
        at org.neo4j.unsafe.impl.batchimport.IdMapperPreparationStep.process(IdMapperPreparationStep.java:54)
        at org.neo4j.unsafe.impl.batchimport.staging.LonelyProcessingStep$1.run(LonelyProcessingStep.java:56)
    Duplicate input ids that would otherwise clash can be put into separate id space, read more about how to use id spaces in the manual: http://neo4j.com/docs/2.3.2/import-tool-header-format.html#import-tool-id-spaces
    Caused by:Id '146404300' is defined more than once in bibliographicresource, at least at /home/swissbib/environment/tools/neo4j-community-2.3.2/bin/files/br.csv:2 and /home/swissbib/environment/tools/neo4j-community-2.3.2/bin/files/br.csv:2

I can't see any difference between my files and the documentation at http://neo4j.com/docs/2.3.2/import-tool-header-format.html#import-tool-id-spaces, since I used the ID space notation (as far as I can tell).

Thanks for any hints!

Günter
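
To rule out genuine duplicates in the input, a check along the following lines could be run over the node files, in the same order as they are passed via --nodes. This is only a minimal sketch and not part of neo4j-import: find_duplicate_ids is a hypothetical helper, and it assumes comma-separated files with a single header line that contains one ID column written as name:ID or name:ID(space).

```python
import csv
from collections import defaultdict


def find_duplicate_ids(node_files):
    """Return {(id_space, id): [(path, line), ...]} for IDs seen more than once.

    node_files is the list of paths exactly as given to --nodes, so a path
    that is listed twice is read twice (an assumption about how the tool
    treats repeated arguments).
    """
    seen = defaultdict(list)
    for path in node_files:
        with open(path, newline="", encoding="utf-8") as handle:
            reader = csv.reader(handle)
            header = next(reader)
            id_col, id_space = None, None
            # Find the ID column, e.g. "brId:ID(bibliographicresource)".
            for index, field in enumerate(header):
                _, _, kind = field.partition(":")
                if kind == "ID" or kind.startswith("ID("):
                    id_col = index
                    # "global" is just a label used here for "no ID space given".
                    id_space = kind[2:].strip("()") or "global"
                    break
            if id_col is None:
                continue  # no ID column in this file, nothing to check
            for row in reader:
                if len(row) > id_col:
                    seen[(id_space, row[id_col])].append((path, reader.line_num))
    # Keep only the IDs that were defined at more than one place.
    return {key: places for key, places in seen.items() if len(places) > 1}
```

Called with the two --nodes arguments from the command above, i.e. find_duplicate_ids(["files/br.csv", "files/br.csv"]), the sketch reports '146404300' in bibliographicresource at files/br.csv line 2 twice, which is the same pair of locations named in the exception message.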
