I tried batch-importing some mock-up data just to test how long this would take. The file structure I use is as follows:
The vertex file:

    i:id  name  l:label
    1     1     lable
    2     2     lable
    ...

Edge file:

    start  end  weight
    1      2    25
    ...

In the real data each vertex has a StringID (and maybe a long DB ID) and needs to be indexed on those as well. Would it be faster to map the string IDs to long IDs (the DB ID) before loading, or can I somehow use the string IDs directly?

Finally, with the current test I can import 15 million nodes and 150 million edges in ~2 hours. How can I speed this up further? The real data will be slower to import, since it has more properties and needs to be indexed. Is it possible to distribute the import across machines?

Thanks for the help,
Martin

On Wednesday, April 16, 2014 12:23:12 PM UTC+2, Michael Hunger wrote:
> 1. Yes, that's what Neo4j is built for.
> 2. Actually, the batch-import is fast even for large datasets.
> 3. How about the Neo4j-Browser, which comes with Neo4j out of the box? See
>    this video for an example: https://www.youtube.com/watch?v=qbZ_Q-YnHYo
>
> Cheers
>
> Michael
>
> On 16.04.2014 at 09:54, Martin Neumann <[email protected]> wrote:
>
> > Hej everyone
> >
> > I want to build the following pipeline (I'm prototyping right now):
> >
> > *Description:*
> > Data -> Map/Reduce Giraph Pipeline -> graph -> Neo4j -> application
> >
> > The system would run nightly, rebuilding/replacing the graph. The graph
> > would then be dumped into a graph DB so that the application layer can
> > query it.
> > The graph is currently 20 million V and 200 million E in edge-list format,
> > with string vertex IDs and key/value pair data on the edges (one of the
> > keys is the edge type). The application layer only reads from that graph.
> >
> > *Here are my questions:*
> > 1. Is Neo4j the right tool for the job? (I have no updates, no
> >    transactions, but lots of queries.)
> > 2. What is the best way to import the data into Neo4j? (I have heard the
> >    batch import can be slow for large data, and this would be a bottleneck.)
> > 3. Is there a simple online query tool I can hand to the application
> >    developer to "browse" the graph?
> >
> > Cheers,
> > Martin
> >
> > --
> > You received this message because you are subscribed to the Google Groups
> > "Neo4j" group.
> > To unsubscribe from this group and stop receiving emails from it, send an
> > email to [email protected].
> > For more options, visit https://groups.google.com/d/optout.
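
On the string-ID question above: one option is a preprocessing pass that assigns a consecutive long ID to every string vertex ID and rewrites the edge list to use those longs. The following is only a minimal sketch of that idea, not something taken from the batch-import tool itself. The sample data, the `string_id` column name and the helper names `build_id_map` and `rewrite_edges` are all hypothetical; the tab-separated layout with `start`, `end` and `weight` columns follows the edge file shown above. Whether this ends up faster than letting the importer index the string IDs would have to be measured.

```python
import csv
import io

# Hypothetical sample data; real files would be opened from disk instead.
VERTICES = "string_id\tname\tlabel\nuser:alice\tAlice\tPerson\nuser:bob\tBob\tPerson\n"
EDGES = "start\tend\tweight\nuser:alice\tuser:bob\t25\n"


def build_id_map(vertex_file, key_column="string_id"):
    """Assign consecutive long IDs (starting at 1) to string vertex IDs."""
    id_map = {}
    for row in csv.DictReader(vertex_file, delimiter="\t"):
        # setdefault keeps the first ID if a string ID appears twice.
        id_map.setdefault(row[key_column], len(id_map) + 1)
    return id_map


def rewrite_edges(edge_file, id_map, out_file):
    """Copy the edge list, replacing string endpoints with their long IDs."""
    reader = csv.DictReader(edge_file, delimiter="\t")
    writer = csv.writer(out_file, delimiter="\t", lineterminator="\n")
    writer.writerow(["start", "end", "weight"])
    for row in reader:
        # A KeyError here means an edge refers to a vertex missing from the vertex file.
        writer.writerow([id_map[row["start"]], id_map[row["end"]], row["weight"]])


id_map = build_id_map(io.StringIO(VERTICES))
out = io.StringIO()
rewrite_edges(io.StringIO(EDGES), id_map, out)
result = out.getvalue()
print(result)
```

The mapping lives in an in-memory dictionary, so for 20 million vertices it may need several gigabytes of RAM. The string ID can still be written to the vertex file as an ordinary property, so it remains available for indexing after the import.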
