Hello!
I found the cause of the failing relationship import:

the header of the relationships file has to specify the keyword (index 
name) used for automatic indexing, like:
# node.csv
id:int:KEYWORD

# rels.csv
id:int:KEYWORD  id:int:KEYWORD

and then in batch.properties:
batch_import.node_index.KEYWORD=exact

I also had duplicate entries in my node.csv file; after removing them I 
was able to batch import 4M nodes and 100M relationships on a MacBook 
with 8 GB of RAM in about 1 hour.
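
For anyone hitting the same duplicates problem, here is a minimal sketch of the de-duplication step. It is not the exact script I ran: it assumes a tab-separated file with a single header row and the id in the first column, and the function name dedupe_by_id is just made up for illustration.

```python
import csv
import io


def dedupe_by_id(src, dst, id_column=0, delimiter="\t"):
    """Copy rows from src to dst, keeping the first row seen for each id.

    Assumes one header row, which is always copied through unchanged.
    Returns the number of duplicate rows that were dropped.
    """
    reader = csv.reader(src, delimiter=delimiter)
    writer = csv.writer(dst, delimiter=delimiter, lineterminator="\n")

    writer.writerow(next(reader))  # header, e.g. id:int:KEYWORD

    seen = set()
    dropped = 0
    for row in reader:
        if not row:  # skip blank lines
            continue
        key = row[id_column]
        if key in seen:
            dropped += 1
            continue
        seen.add(key)
        writer.writerow(row)
    return dropped


# Tiny in-memory demo; the "name" column is invented sample data.
sample = "id:int:KEYWORD\tname\n1\ta\n2\tb\n1\ta-again\n"
out = io.StringIO()
dropped = dedupe_by_id(io.StringIO(sample), out)
```

For real files, pass two handles opened with open(path, newline="") instead of the StringIO objects. The set of seen ids is kept in memory, which should be fine for a few million integer ids.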

Now I have trouble querying the graph, because it looks like the indexes 
were not applied properly.

For example, I couldn't find the KEYWORD label in my db.

I am going to post this question in another thread, because the import is 
successful now.



On Wednesday, 13 August 2014 17:36:08 UTC+2, gg4u wrote:
>
> Hello, 
>
> I split this off from the post 
> https://groups.google.com/forum/#!topic/neo4j/EVdq1qUaFQY
> where I asked more general questions about bulk import performance; 
> here I focus on the batch importer.
>
> I am having problems importing relationships and cannot figure out the 
> reason.
> I generated two files, node.csv and rels2.csv; please see the attachments.
> The import fails on the first line: it cannot find the end node of the 
> first relationship, even though that node is in node.csv.
>
> To index nodes, I am using this batch.properties:
>
> batch_import.keep_db=true
> batch_import.node_index.myindexname=fulltext
> batch_import.node_index.id=exact
> batch_import.node_index.node_auto_index=exact
>
> I tried manually removing all the lines to make a trivial import with 
> just the two nodes of the first relationship, but it kept failing.
>
> Only if I remove *all* the entries from rels2.csv am I able to import 
> the nodes, with no relationships.
>
> So I think the problem is in that file. I read the documentation and maybe 
> I misunderstood something; as far as I understood, with 
> index.node_auto_index I should be able to use my custom node ids (integers, 
> in my case, not in sequential or contiguous order).
>
> *Could you help in shedding light on the issue with my rels.csv?*
>
> Also, *very important:*
> I want to index my node ids so that I can put a *constraint on ids to 
> avoid duplicates.*
> Instead, each time I run the importer, the nodes are created again as 
> duplicates (even when the full import with relationships fails).
>
> How can I create *constraints* so that the batch importer does not 
> produce duplicates?
> This is important if I want to update my graph in bulk, e.g. to keep 
> relationships that were created on the server after the bulk import, or 
> to upload new subgraphs.
>
>
> Thank you very much!
>
>
>
>
>

