I don't have access to the server right now, but here is almost everything you 
need:
OS: Ubuntu 14.04 Server
RAM: 16 GB
DISK: SSD 1 TB
CPU: quad-core, 3.0 GHz

I have two CSV files: one to create the nodes and the other to 
create the relationships.
The one I used to create the nodes looks similar to the following:

url_original,scheme,ext,path,netloc
http://www.test.com/test.php?id=1, http, php, test.php, www.test.php
http://www.test2.com/test2.php?id=1, http, php, test2.php, www.test2.php
http://www.test3.com/test3.php?id=1, http, php, test3.php, www.test3.php
http://www.test4.com/test4.php?id=1, http, php, test4.php, www.test4.php
http://www.test5.com/test5.php?id=1, http, php, test5.php, www.test5.php



My query to insert the nodes into the graph is:
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "file:/home/test/nodes.csv" AS csv
MERGE (m:URL {url_original: csv.url_original})
ON CREATE SET m.scheme = csv.scheme, m.netloc = csv.netloc,
              m.path = csv.path, m.ext = csv.ext


The other CSV file contains the relationships. It looks similar to the 
following:

source,no_requests,response_code,origin,target
http://www.test.com/index.php,1,200,embedded,http://www.test.com/style.css
http://www.test.com/index.php,1,200,embedded,http://www.test.com/logo.jpg
http://www.test.com/index.php,1,200,embedded,http://www.test.com/arrow.jpw
http://www.test.com/index.php,1,200,embedded,http://www.test.com/jquery.js


The query I use is the following:

USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "file:/home/test/relations.csv" AS csv
MATCH (s:URL {url_original: csv.source})
MATCH (t:URL {url_original: csv.target})
CREATE UNIQUE (s)-[r:VISITED {no_requests: toInt(csv.no_requests),
  response_code: toInt(csv.response_code), origin: csv.origin}]->(t)



I have an index on URL:url_original
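
For reference, in Neo4j 2.x such an index would typically be created with a 
statement like the one below. This is a sketch written without access to the 
server, so the exact statement used there is an assumption; the commented-out 
uniqueness constraint is only a possible alternative, not something confirmed 
to be in place.

```cypher
// Schema index on the property used by MERGE and MATCH (Neo4j 2.x syntax).
CREATE INDEX ON :URL(url_original);

// Possible alternative (assumption, not confirmed on the server):
// a uniqueness constraint, which is also backed by an index.
// CREATE CONSTRAINT ON (u:URL) ASSERT u.url_original IS UNIQUE;
```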

As I said, I don't have access to the server right now, so I won't be able to 
provide you with messages.log yet, but I will do that ASAP.
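
In the meantime, a minimal sketch of how one could check whether anything was 
actually committed, assuming the URL label and VISITED relationship type from 
the queries above, is to run these counts as separate statements:

```cypher
// Number of URL nodes created by the first import.
MATCH (u:URL) RETURN count(u) AS url_nodes;

// Number of VISITED relationships created by the second import.
MATCH (:URL)-[r:VISITED]->(:URL) RETURN count(r) AS visited_relationships;
```

Since USING PERIODIC COMMIT commits rows in batches, counts that keep growing 
while the browser still shows "processing" would suggest the import is 
progressing rather than hanging.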



On Sunday, June 21, 2015 at 6:33:38 PM UTC+3, Michael Hunger wrote:
>
> Please share the structure of your csv, your query, your configuration 
> (OS, RAM, DISK etc) and your graph.db/messages.log
> And how you run your query.
>
> On Sun, Jun 21, 2015 at 3:24 PM, Ibrahim El-sayed <[email protected]> wrote:
>
>> I have a large CSV file that I want to insert into Neo4j. 
>> I use the periodic commit method to load the CSV into the server, 
>> since this is supposed to be the ideal way to deal with big data. 
>> I have created small test CSV files of around 7 MB each. I tried to do a 
>> periodic commit on these files; however, when I send the query from the Neo4j 
>> web interface, it keeps showing "processing" and never returns. What 
>> might be the problem? And how can I make sure that my data has been 
>> processed? 
>>
>> I would also like to know the fastest way to insert and query data in 
>> Neo4j, given that my data set is large. I would like to insert around 
>> 5 million nodes and 5 million relationships. I don't see that as feasible 
>> with the current performance, though. 
>>
>> Regards
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "Neo4j" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to [email protected].
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>
