I don't have access to the server right now, but here is almost everything you need:

OS: Ubuntu 14.04 Server
RAM: 16 GB
Disk: 1 TB SSD
CPU: quad-core, 3.0 GHz
I have *two* CSV files: one to create the nodes and the other to create the relationships.

The one I use to create the nodes looks similar to this:

url_original,scheme,ext,path,netloc
http://www.test.com/test.php?id=1, http, php, test.php, www.test.com
http://www.test2.com/test2.php?id=1, http, php, test2.php, www.test2.com
http://www.test3.com/test3.php?id=1, http, php, test3.php, www.test3.com
http://www.test4.com/test4.php?id=1, http, php, test4.php, www.test4.com
http://www.test5.com/test5.php?id=1, http, php, test5.php, www.test5.com

and my query to insert the nodes into the graph is:

USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "file:/home/test/nodes.csv" AS csv
MERGE (m:URL {url_original: csv.url_original})
ON CREATE SET m.scheme = csv.scheme, m.netloc = csv.netloc, m.path = csv.path, m.ext = csv.ext

The other CSV file contains the relationships. It looks similar to this:

source,no_requests,response_code,origin,target
http://www.test.com/index.php,1,200,embedded,http://www.test.com/style.css
http://www.test.com/index.php,1,200,embedded,http://www.test.com/logo.jpg
http://www.test.com/index.php,1,200,embedded,http://www.test.com/arrow.jpg
http://www.test.com/index.php,1,200,embedded,http://www.test.com/jquery.js

and the query I use for it is:

USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "file:/home/test/relations.csv" AS csv
MATCH (s:URL {url_original: csv.source})
MATCH (t:URL {url_original: csv.target})
CREATE UNIQUE (s)-[r:VISITED {no_requests: toInt(csv.no_requests), response_code: toInt(csv.response_code), origin: csv.origin}]->(t)

I have an index on :URL(url_original); the statement I used is included at the end of this mail.

As I said, I don't have access to the server right now, so I won't be able to provide messages.log yet, but I will do that as soon as possible, and I will also sanity-check the import with the count queries at the end of this mail.

On Sunday, June 21, 2015 at 6:33:38 PM UTC+3, Michael Hunger wrote:
>
> Please share the structure of your CSV, your query, your configuration
> (OS, RAM, disk, etc.) and your graph.db/messages.log.
> And how you run your query.
>
> On Sun, Jun 21, 2015 at 3:24 PM, Ibrahim El-sayed <[email protected]> wrote:
>
>> I have a large CSV file that I want to insert into Neo4j.
>> I use the periodic commit method to commit from my CSV to the server,
>> since this is supposed to be the ideal way to deal with big data.
>> I created small test CSV files, around 7 MB each, and tried the periodic
>> commit on them, but when I send the query from the Neo4j web interface it
>> keeps showing "processing" and never returns. What might be the problem,
>> and how can I make sure that my data has been processed?
>>
>> I also would like to know the fastest way to insert and query data in
>> Neo4j, given that my data set is large. I would like to insert around
>> 5 million nodes and 5 million relationships. I don't see that as feasible
>> with the current performance, though.
>>
>> Regards
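The index mentioned above was created with something along these lines (Neo4j 2.x syntax; I don't have the exact statement in front of me, so treat this as a sketch from memory):

CREATE INDEX ON :URL(url_original);

// A uniqueness constraint on the same property would be an alternative;
// it also backs the MERGE on url_original with a uniqueness guarantee:
// CREATE CONSTRAINT ON (u:URL) ASSERT u.url_original IS UNIQUE;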
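And these are the rough sanity-check queries I plan to run once I am back on the server, just to confirm the imports actually created something (plain counts; nothing here assumes anything about my data beyond the :URL label and :VISITED relationship type used above):

// total number of URL nodes created by the node import
MATCH (n:URL) RETURN count(n);

// total number of VISITED relationships created by the relationship import
MATCH (:URL)-[r:VISITED]->(:URL) RETURN count(r);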
