Hi, I am trying to build a graph database as part of a project, and possibly convince my company to choose Neo4j, but right now I'm failing miserably.
Background: We make frameworks used by millions of users. For analytics purposes, we record every action that reaches the server as an "event", and we end up with 100M+ events a day.

I am trying to see whether Neo4j is a viable option, so I am importing a sample of data for testing, as follows:

- Creators, Users, Devs, Applications, Devices: ~100K nodes of each type
- Contents: ~1M nodes
- Events: 2B nodes

Importing the Devs and the other small node types was quite fast once I optimized the Cypher commands, but importing the Events was hellishly slow. It took 3 days just to create the nodes (plain CREATE, not even MERGE), which isn't usable for real time in our situation (we have more traffic daily than the subset I am trying to import).

Now I am trying to create the relationships. One relationship type has been running for 4 days and is nowhere near finished, and I still have 2 more relationship types to go.

Using the EXPLAIN command, I see that finding a node through the unique index means taking a lock on the index, so I can't split my script and run it in parallel shell processes. Using neo4j-import doesn't work on existing databases.

*Is there any solution?*

More details on all the commands, and a sample from Arrows, are here: https://gist.github.com/Einharch/23a31f869787950a898fed051e1a6ee0
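
To make the relationship step concrete, here is a minimal sketch of the general pattern being discussed: grouping rows into batches and sending one parameterized Cypher statement per batch. It is not the set of commands from the gist. The labels (`User`, `Event`), the relationship type (`TRIGGERED`), the property names, the batch size, and the `run_query` callable are all assumptions made for illustration. The `$rows` parameter syntax is the one used by recent Neo4j releases; older releases write the parameter as `{rows}`.

```python
from itertools import islice
from typing import Callable, Dict, Iterable, Iterator, List

# Hypothetical labels, relationship type, and property names.
# The real ones are in the gist linked above.
CREATE_RELS = """
UNWIND $rows AS row
MATCH (u:User {id: row.user_id})
MATCH (e:Event {id: row.event_id})
CREATE (u)-[:TRIGGERED]->(e)
"""


def batched(rows: Iterable[Dict], size: int) -> Iterator[List[Dict]]:
    """Yield successive lists of at most `size` rows."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(rows)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def load(run_query: Callable[[str, List[Dict]], None],
         rows: Iterable[Dict],
         size: int = 10_000) -> int:
    """Send one parameterized statement per batch; return the batch count.

    `run_query` is any callable taking (query, rows). With a driver it
    could wrap something like session.run(query, rows=rows); that wiring
    is left out here so the sketch runs without a database.
    """
    count = 0
    for batch in batched(rows, size):
        run_query(CREATE_RELS, batch)
        count += 1
    return count
```

Each batch becomes one transaction's worth of work, so the batch size trades memory per transaction against the number of round trips. Whether this helps with the index lock described above is a separate question that the sketch does not answer.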
