Yes, I also tried USING PERIODIC COMMIT with values of 10000 and 50000. And yes, as stated, I set indexes on the name properties in anticipation of the queries.
I was wrong to say that it 'failed', because it didn't (except when I ran it in the web browser, where it timed out). What I meant was that it was taking an enormous amount of time, much more than the other imports would suggest if scaled linearly. I never let it finish because I could no longer wait.

With imports that include a MATCH clause, should I expect the running time to be excessive relative to imports that simply CREATE nodes?

On Friday, June 6, 2014 12:10:54 PM UTC-6, Michael Hunger wrote:
>
> How did it fail?
>
> Did you try USING PERIODIC COMMIT 10000 ?
>
> Do you have an index for :User(name) and :Group(name) ?
>
>
> On Fri, Jun 6, 2014 at 12:34 AM, Eric Olson <[email protected]> wrote:
>
>> I have read some other topics on this and am still coming up short on a
>> satisfying solution.
>>
>> I am:
>>
>> - Populating my DB using the new CSV import query in Cypher
>> - Using the Neo4j shell
>> - Including the "USING PERIODIC COMMIT" statement
>>
>> I have:
>>
>> - Successfully imported a 10,000 line file in ~2 seconds
>> - Successfully imported a 500,000 line file in ~20 seconds
>> - Successfully imported a 5,000,000 line file in ~3 minutes
>> - FAILED to import a 100,000,000 line file!
>>
>> The first 3 imports were just to create some simple nodes. The failed
>> import was to create relationships and the statement looks like:
>>
>>
>> USING PERIODIC COMMIT 100000
>> LOAD CSV WITH HEADERS FROM 'file:/mcpdata/5_usr-grp.csv' AS line
>> MATCH (usr:User { name: line.user }), (grp:Group { name: line.group })
>> CREATE (user)-[:IN]->(grp)
>>
>>
>> And yes, I have set indexes on the name properties of each so that they
>> can be retrieved quickly.
>>
>> This has been spinning for well over an hour and still no completion. I
>> am assuming based on the other timings that it should take about 30 minutes
>> + query times to retrieve the objects I am making the relationship between.
>> Is it still the MATCH query that is killing me here? If on average it takes
>> 10ms for each object retrieval, then with 100M lines (200M total retrievals
>> then), this could add up to an additional 23 days of running time :)
>>
>> IS THERE A BETTER WAY?
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "Neo4j" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> For more options, visit https://groups.google.com/d/optout.
>>
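
The 23-day figure in the original post can be checked with a few lines of arithmetic. The sketch below is only a back-of-the-envelope model, not a measurement: the 10 ms per lookup is the post's own hypothetical, and the function name `lookup_overhead_days` is made up here for illustration.

```python
# Back-of-the-envelope check of the lookup overhead estimated in the post.
# All inputs are taken from the post; the 10 ms figure is hypothetical.
LINES = 100_000_000          # rows in the relationship CSV
LOOKUPS_PER_LINE = 2         # one User lookup and one Group lookup per row
SECONDS_PER_LOOKUP = 0.010   # assumed average lookup time (10 ms)

SECONDS_PER_DAY = 24 * 60 * 60


def lookup_overhead_days(lines, lookups_per_line, seconds_per_lookup):
    """Return the total time spent on lookups, in days."""
    total_lookups = lines * lookups_per_line
    return total_lookups * seconds_per_lookup / SECONDS_PER_DAY


days = lookup_overhead_days(LINES, LOOKUPS_PER_LINE, SECONDS_PER_LOOKUP)
print(f"{days:.1f} days")  # 200M lookups * 10 ms = 2,000,000 s, about 23.1 days
```

The estimate scales linearly with the per-lookup time, so the result depends entirely on that assumption. For example, `lookup_overhead_days(100_000_000, 2, 0.0001)` (a hypothetical 0.1 ms per lookup) gives roughly 5.6 hours instead of 23 days.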
