Hi,

We are taking our first steps with Neo4j and have tried several ways to 
create an initial database.

1) We used the Java API with an embedded database.
Here 
https://github.com/linked-swissbib/swissbib-metafacture-commands/blob/neo4j-tests/src/main/java/org/swissbib/linked/mf/writer/NeoIndexer.java#L76
a transaction is closed that covers 20,000 nodes with relationships to 
around 40,000 other nodes. 
We are surprised that Transaction.close() needs up to 30 seconds to 
write these nodes to disk.


2) Then I wanted to compare my results with the neo4j-import script 
provided with the Neo4j server.
With this method I am having difficulties with the format of the CSV files.

My small examples:
first node file:
lsId:ID(localsignature),:LABEL
"NEBIS/002527587",LOCALSIGNATURE
"OCoLC/637556711",LOCALSIGNATURE



second node file:
brId:ID(bibliographicresource),active,:LABEL
146404300,true,BIBLIOGRAPHICRESOURCE


relationship file:
:START_ID(bibliographicresource),:END_ID(localsignature),:TYPE
146404300,"NEBIS/002527587",SIGNATUREOF
146404300,"OCoLC/637556711",SIGNATUREOF

I ran:
./neo4j-import --into [path-to-db]/test.db/ --nodes files/br.csv --nodes files/br.csv --relationships:SIGNATUREOF files/signatureof.csv
which throws the following exception:

Done in 191ms
Prepare node index
Exception in thread "Thread-3" org.neo4j.unsafe.impl.batchimport.cache.idmapping.string.DuplicateInputIdException: Id '146404300' is defined more than once in bibliographicresource, at least at /home/swissbib/environment/tools/neo4j-community-2.3.2/bin/files/br.csv:2 and /home/swissbib/environment/tools/neo4j-community-2.3.2/bin/files/br.csv:2
    at org.neo4j.unsafe.impl.batchimport.input.BadCollector$2.exception(BadCollector.java:107)
    at org.neo4j.unsafe.impl.batchimport.input.BadCollector.checkTolerance(BadCollector.java:176)
    at org.neo4j.unsafe.impl.batchimport.input.BadCollector.collectDuplicateNode(BadCollector.java:96)
    at org.neo4j.unsafe.impl.batchimport.cache.idmapping.string.EncodingIdMapper.detectDuplicateInputIds(EncodingIdMapper.java:590)
    at org.neo4j.unsafe.impl.batchimport.cache.idmapping.string.EncodingIdMapper.buildCollisionInfo(EncodingIdMapper.java:494)
    at org.neo4j.unsafe.impl.batchimport.cache.idmapping.string.EncodingIdMapper.prepare(EncodingIdMapper.java:282)
    at org.neo4j.unsafe.impl.batchimport.IdMapperPreparationStep.process(IdMapperPreparationStep.java:54)
    at org.neo4j.unsafe.impl.batchimport.staging.LonelyProcessingStep$1.run(LonelyProcessingStep.java:56)
Duplicate input ids that would otherwise clash can be put into separate id space, read more about how to use id spaces in the manual: http://neo4j.com/docs/2.3.2/import-tool-header-format.html#import-tool-id-spaces
Caused by:Id '146404300' is defined more than once in bibliographicresource, at least at /home/swissbib/environment/tools/neo4j-community-2.3.2/bin/files/br.csv:2 and /home/swissbib/environment/tools/neo4j-community-2.3.2/bin/files/br.csv:2


I can't see any difference between my files and the documentation at
http://neo4j.com/docs/2.3.2/import-tool-header-format.html#import-tool-id-spaces
since I tried to use the ID space notation (as far as I can see...).
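
To rule out a real duplicate in the data, a check along the following lines could be run over the node files before importing. This is only a minimal Python sketch and not part of neo4j-import: the names find_duplicate_ids and ID_HEADER are made up for illustration, and it assumes comma-separated files whose first line is a header with one ID column written as name:ID(idspace) or :ID.

```python
import csv
import re
from collections import defaultdict

# Matches an ID column header such as "lsId:ID(localsignature)" or ":ID".
# Hypothetical helper pattern, not taken from the import tool itself.
ID_HEADER = re.compile(r"^(?P<name>[^:]*):ID(?:\((?P<space>[^)]*)\))?$")


def find_duplicate_ids(node_files):
    """Return {(id_space, node_id): [(path, line_no), ...]} for IDs seen more than once.

    The files are read in the order given, so a file that is listed twice
    is also read twice.
    """
    seen = defaultdict(list)
    for path in node_files:
        with open(path, newline="", encoding="utf-8") as handle:
            reader = csv.reader(handle)
            header = next(reader, None)
            if header is None:
                continue  # empty file

            # Locate the first ID column and its ID space ("" if none is given).
            id_col, id_space = None, ""
            for index, field in enumerate(header):
                match = ID_HEADER.match(field.strip())
                if match:
                    id_col, id_space = index, match.group("space") or ""
                    break
            if id_col is None:
                continue  # no ID column in this file

            for row in reader:
                if len(row) > id_col:
                    # reader.line_num is the 1-based line of the row just read.
                    seen[(id_space, row[id_col])].append((path, reader.line_num))

    return {key: places for key, places in seen.items() if len(places) > 1}
```

The list passed in should be exactly the list of --nodes arguments given on the command line, in the same order. A file that is listed twice shows up as the same ID at the same line number in the same file, while IDs that are equal but live in different ID spaces are not reported.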

Thanks for any hints!

Günter
