Hello!

I found the cause of the failing relationship import: the header of the relationships file must specify a keyword used for automatic indexing, like this:

# node.csv
id:int:KEYWORD

# rels.csv
id:int:KEYWORD id:int:KEYWORD

and then in batch.properties:

batch_import.node_index.KEYWORD=exact

I also had trouble finding duplicates in my node.csv file; once I removed them, I was able to batch import 4M nodes and 100M relationships on an 8 GB MacBook in about 1 hour.

Now I have problems querying the graph, because it looks like the indexes were not applied properly. For example, I couldn't find the KEYWORD label in my db.

I am going to post these questions in another thread, since the import itself is successful now.

On Wednesday, 13 August 2014 at 17:36:08 UTC+2, gg4u wrote:
>
> Hello,
>
> I split the post
> https://groups.google.com/forum/#!topic/neo4j/EVdq1qUaFQY
> There I asked more general questions about bulk import performance;
> here I focus on the batch importer.
>
> I am having trouble importing relationships, and I cannot figure out
> the reason.
> I generated my two files, node.csv and rels2.csv; please see them in the attachment.
> The import fails on the first line: it cannot find the end node of the
> first relationship, even though that node is in node.csv.
>
> To index nodes, I am using this batch.properties:
>
> batch_import.keep_db=true
> batch_import.node_index.myindexname=fulltext
> batch_import.node_index.id=exact
> batch_import.node_index.node_auto_index=exact
>
> I tried manually removing all the lines and doing a trivial import with
> just the two nodes of the first relationship, but it kept failing.
>
> Only if I remove *all* the node entries from rels2.csv am I able to import
> the nodes, with no relationships.
>
> So I think the problem is in that file. I read the documentation and maybe
> I misunderstood something; as far as I understood, with
> index.node_auto_index I should be able to use my custom node ids (integers,
> in my case, not in progressive and continuous order).
>
> *Could you help shed light on the issue with my rels.csv?*
>
> Also, *very important:*
> I want to index my nodes.id so as to put a *constraint on ids to avoid
> duplicates.*
> Instead, each time I run the importer, I notice that nodes are created as
> duplicates (even when the full import with relationships fails).
>
> How can I create *constraints* so that the batch importer produces no
> duplicates?
> This is important if I want to update my graph in bulk, e.g. to keep
> relationships that were created on the server after the bulk import, or to
> upload new subgraphs.
>
> Thank you very much!
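
For anyone hitting the same two problems (duplicate ids in node.csv, and relationships pointing at nodes the importer cannot find), here is a minimal Python sketch of a pre-import check. It is not part of the batch importer. It assumes tab-separated files, the node id in the first column of node.csv, and the start and end ids in the first two columns of rels.csv. The function names `dedupe_nodes` and `missing_endpoints` are made up for illustration.

```python
import csv


def dedupe_nodes(lines, delimiter="\t", key_column=0):
    """Return (header, rows), keeping only the first row seen for each id.

    `lines` is any iterable of text lines, e.g. an open node.csv file.
    Assumption: the first line is the header and the id is in `key_column`.
    """
    reader = csv.reader(lines, delimiter=delimiter)
    header = next(reader)
    seen = set()
    unique = []
    for row in reader:
        if not row:
            continue  # skip blank lines
        key = row[key_column]
        if key in seen:
            continue  # duplicate id: drop this row
        seen.add(key)
        unique.append(row)
    return header, unique


def missing_endpoints(rel_lines, node_ids, delimiter="\t"):
    """Return relationship rows whose start or end id is not in `node_ids`.

    Assumption: the first two columns of each row are the start and end ids.
    Rows with fewer than two columns are reported as bad as well.
    """
    reader = csv.reader(rel_lines, delimiter=delimiter)
    next(reader)  # skip the header line
    bad = []
    for row in reader:
        if not row:
            continue  # skip blank lines
        if len(row) < 2 or row[0] not in node_ids or row[1] not in node_ids:
            bad.append(row)
    return bad
```

To use it, open node.csv, pass the file object to `dedupe_nodes`, write the returned header and rows back out with `csv.writer` using the same delimiter, then build a set of the remaining ids and pass it to `missing_endpoints` together with the open rels.csv. Ids are compared as strings, so `1` and `01` count as different ids.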
