Oh and btw. I would LOVE to see a blog post from you about what you're working on!
Thanks so much Michael

On 05.03.2014 at 12:00, Michael Hunger <[email protected]> wrote:

> I just tested your file on MacOS with these settings
> and got 6:30 for the 2m rows
>
> EXTRA_JVM_ARGUMENTS="-Xmx6G -Xms6G -Xmn1G"
>
> on windows you have to add the memory from the mmio settings in
> neo4j.properties to the heap
>
> cat conf/neo4j.properties
> # Default values for the low-level graph engine
> neostore.nodestore.db.mapped_memory=200M
> neostore.relationshipstore.db.mapped_memory=1G
> neostore.propertystore.db.mapped_memory=500M
> neostore.propertystore.db.strings.mapped_memory=250M
> neostore.propertystore.db.arrays.mapped_memory=0M
>
> USING PERIODIC COMMIT 10000
>> LOAD CSV
>> FROM
>> "file:///Users/mh/Downloads/Active_Corporations___Beginning_1800_no_head.csv"
>> AS company
>> CREATE (:DataActiveCorporations
>> {
>> DOS_ID:company[0],
>> Current_Entity_Name:company[1],
>> Initial_DOS_Filing_Date:company[2],
> ......
>> Registered_Agent_Zip:company[23],
>>
>> Location_Name:company[24],
>> Location_Address_1:company[25],
>> Location_Address_2:company[26],
>> Location_City:company[27],
>> Location_State:company[28],
>> Location_Zip:company[29]
>> }
>> );
>
> +-------------------+
> | No data returned. |
> +-------------------+
> Nodes created: 1964486
> Properties set: 58934580
> Labels added: 1964486
> 391059 ms
>
>
>
> On 05.03.2014 at 08:34, Michael Hunger
> <[email protected]> wrote:
>
>> Oh and if you use neo4j-shell without server you have to set the heap in
>> bin\Neo4jShell.bat in EXTRA_JVM_ARGUMENTS="-Xmx4G -Xms4G -Xmn1G"
>>
>> and call
>>
>> bin\Neo4jShell -conf conf\neo4j.properties -path data\graph.db
>>
>> On 05.03.2014 at 08:29, Michael Hunger
>> <[email protected]> wrote:
>>
>>> Yep,
>>>
>>> it would be also interesting how you ran this? With neo4j-shell? Against a
>>> running server?
>>> Did you configure any RAM or memory mapping setting in neo4j.properties?
>>>
>>> Check out this blog post for some hints on memory config:
>>> http://blog.bruggen.com/2014/02/some-neo4j-import-tweaks-what-and-where.html?view=sidebar
>>> Note that on windows the heap settings include the mmio settings unlike
>>> other OS'es.
>>>
>>> Michael
>>>
>>> On 04.03.2014 at 17:22, Mark Needham <[email protected]> wrote:
>>>
>>>> Hi Aram,
>>>>
>>>> * Do you have any other information of the spec of the machine you're
>>>> running this on? e.g. how much RAM etc
>>>> * Have you tried upping the value to PERIODIC COMMIT? Perhaps try it out
>>>> with a smaller subset of the data to measure the impact - try it with
>>>> values of 1,000 / 10,000 perhaps.
>>>> * I think it would be interesting to pull out some other things as nodes
>>>> as well - might lead to more interesting queries e.g. CEO, Location,
>>>> Registered Agent, DOS Process, Jurisdiction could all be nodes that link
>>>> back to a DOS.
>>>>
>>>> Let me know if any of that doesn't make sense.
>>>> Mark
>>>>
>>>>
>>>> On 4 March 2014 15:54, Aram Chung <[email protected]> wrote:
>>>> Hi,
>>>>
>>>> I was asked to post this here by Mark Needham (@markhneedham) who thought
>>>> my query took longer than it should.
>>>>
>>>> I'm trying to see how graph databases could be used in investigative
>>>> journalism: I was loading in New York State's Active Corporations:
>>>> Beginning 1800 data from
>>>> https://data.ny.gov/Economic-Development/Active-Corporations-Beginning-1800/n9v6-gdp6
>>>> as a 1964486-row csv (and deleted all U+F8FF characters, because I was
>>>> getting "[null] is not a supported property value"). The Cypher query I
>>>> used was
>>>>
>>>> USING PERIODIC COMMIT 500
>>>> LOAD CSV
>>>> FROM
>>>> "file://path/to/csv/Active_Corporations___Beginning_1800__without_header__wonky_characters_fixed.csv"
>>>> AS company
>>>> CREATE (:DataActiveCorporations
>>>> {
>>>> DOS_ID:company[0],
>>>> Current_Entity_Name:company[1],
>>>> Initial_DOS_Filing_Date:company[2],
>>>> County:company[3],
>>>> Jurisdiction:company[4],
>>>> Entity_Type:company[5],
>>>>
>>>> DOS_Process_Name:company[6],
>>>> DOS_Process_Address_1:company[7],
>>>> DOS_Process_Address_2:company[8],
>>>> DOS_Process_City:company[9],
>>>> DOS_Process_State:company[10],
>>>> DOS_Process_Zip:company[11],
>>>>
>>>> CEO_Name:company[12],
>>>> CEO_Address_1:company[13],
>>>> CEO_Address_2:company[14],
>>>> CEO_City:company[15],
>>>> CEO_State:company[16],
>>>> CEO_Zip:company[17],
>>>>
>>>> Registered_Agent_Name:company[18],
>>>> Registered_Agent_Address_1:company[19],
>>>> Registered_Agent_Address_2:company[20],
>>>> Registered_Agent_City:company[21],
>>>> Registered_Agent_State:company[22],
>>>> Registered_Agent_Zip:company[23],
>>>>
>>>> Location_Name:company[24],
>>>> Location_Address_1:company[25],
>>>> Location_Address_2:company[26],
>>>> Location_City:company[27],
>>>> Location_State:company[28],
>>>> Location_Zip:company[29]
>>>> }
>>>> );
>>>>
>>>> Each row is one node so it's as close to the raw data as possible. The
>>>> idea is loosely that these nodes will be linked with new nodes
>>>> representing people and addresses verified by reporters.
>>>>
>>>> This is what I got:
>>>>
>>>> +-------------------+
>>>> | No data returned. |
>>>> +-------------------+
>>>> Nodes created: 1964486
>>>> Properties set: 58934580
>>>> Labels added: 1964486
>>>> 4550855 ms
>>>>
>>>> Some context information:
>>>> Neo4j Milestone Release 2.1.0-M01
>>>> Windows 7
>>>> java version "1.7.0_03"
>>>>
>>>> Best,
>>>> Aram
>>>>
>>>> --
>>>> You received this message because you are subscribed to the Google Groups
>>>> "Neo4j" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send an
>>>> email to [email protected].
>>>> For more options, visit https://groups.google.com/groups/opt_out.
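For anyone comparing the two runs quoted above, here is a rough Python sketch of the arithmetic, using only the row count, property count, and elapsed times printed by the shell in this thread. The function names and run labels are made up for illustration. Note that the two runs differ in OS, machine, heap size, and PERIODIC COMMIT value all at once, so the ratio does not say which change mattered most.

```python
# Figures copied from the two shell outputs quoted in this thread.
ROWS = 1_964_486
PROPERTIES_SET = 58_934_580
RUNS_MS = {
    "PERIODIC COMMIT 500, Windows 7 (original run)": 4_550_855,
    "PERIODIC COMMIT 10000, MacOS, -Xmx6G (re-test)": 391_059,
}


def rows_per_second(rows, elapsed_ms):
    """Average import throughput in rows per second."""
    return rows / (elapsed_ms / 1000.0)


def format_duration(elapsed_ms):
    """Render milliseconds as whole minutes and seconds (truncated)."""
    minutes, seconds = divmod(elapsed_ms // 1000, 60)
    return f"{minutes}m{seconds:02d}s"


def speedup(slow_ms, fast_ms):
    """How many times faster the second run was than the first."""
    return slow_ms / fast_ms


if __name__ == "__main__":
    # Each row becomes one node carrying company[0]..company[29].
    print("properties per node:", PROPERTIES_SET // ROWS)
    for label, elapsed_ms in RUNS_MS.items():
        rate = rows_per_second(ROWS, elapsed_ms)
        print(f"{label}: {format_duration(elapsed_ms)}, {rate:.0f} rows/s")
    slow_ms, fast_ms = RUNS_MS.values()
    print(f"ratio between the runs: {speedup(slow_ms, fast_ms):.1f}x")
```

The same helpers can be reused when trying Mark's suggestion of running a smaller subset with different PERIODIC COMMIT values: record the row count and the elapsed ms for each run and compare the rows/s figures.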
