Yep, it would also be interesting to know how you ran this. With neo4j-shell? Against a running server? Did you configure any RAM or memory-mapping settings in neo4j.properties?
Check out this blog post for some hints on memory config:
http://blog.bruggen.com/2014/02/some-neo4j-import-tweaks-what-and-where.html?view=sidebar

Note that on Windows the heap settings include the mmio settings, unlike on other operating systems.

Michael

On 04.03.2014 at 17:22, Mark Needham <[email protected]> wrote:

> Hi Aram,
>
> * Do you have any other information of the spec of the machine you're running
>   this on? e.g. how much RAM etc
> * Have you tried upping the value to PERIODIC COMMIT? Perhaps try it out with
>   a smaller subset of the data to measure the impact - try it with values of
>   1,000 / 10,000 perhaps.
> * I think it would be interesting to pull out some other things as nodes as
>   well - might lead to more interesting queries e.g. CEO, Location, Registered
>   Agent, DOS Process, Jurisdiction could all be nodes that link back to a DOS.
>
> Let me know if any of that doesn't make sense.
> Mark
>
>
> On 4 March 2014 15:54, Aram Chung <[email protected]> wrote:
> Hi,
>
> I was asked to post this here by Mark Needham (@markhneedham) who thought my
> query took longer than it should.
>
> I'm trying to see how graph databases could be used in investigative
> journalism: I was loading in New York State's Active Corporations: Beginning
> 1800 data from
> https://data.ny.gov/Economic-Development/Active-Corporations-Beginning-1800/n9v6-gdp6
> as a 1964486-row csv (and deleted all U+F8FF characters, because I was
> getting "[null] is not a supported property value").
> The Cypher query I used was
>
> USING PERIODIC COMMIT 500
> LOAD CSV
> FROM
> "file://path/to/csv/Active_Corporations___Beginning_1800__without_header__wonky_characters_fixed.csv"
> AS company
> CREATE (:DataActiveCorporations
> {
>   DOS_ID:company[0],
>   Current_Entity_Name:company[1],
>   Initial_DOS_Filing_Date:company[2],
>   County:company[3],
>   Jurisdiction:company[4],
>   Entity_Type:company[5],
>
>   DOS_Process_Name:company[6],
>   DOS_Process_Address_1:company[7],
>   DOS_Process_Address_2:company[8],
>   DOS_Process_City:company[9],
>   DOS_Process_State:company[10],
>   DOS_Process_Zip:company[11],
>
>   CEO_Name:company[12],
>   CEO_Address_1:company[13],
>   CEO_Address_2:company[14],
>   CEO_City:company[15],
>   CEO_State:company[16],
>   CEO_Zip:company[17],
>
>   Registered_Agent_Name:company[18],
>   Registered_Agent_Address_1:company[19],
>   Registered_Agent_Address_2:company[20],
>   Registered_Agent_City:company[21],
>   Registered_Agent_State:company[22],
>   Registered_Agent_Zip:company[23],
>
>   Location_Name:company[24],
>   Location_Address_1:company[25],
>   Location_Address_2:company[26],
>   Location_City:company[27],
>   Location_State:company[28],
>   Location_Zip:company[29]
> }
> );
>
> Each row is one node so it's as close to the raw data as possible. The idea
> is loosely that these nodes will be linked with new nodes representing people
> and addresses verified by reporters.
>
> This is what I got:
>
> +-------------------+
> | No data returned. |
> +-------------------+
> Nodes created: 1964486
> Properties set: 58934580
> Labels added: 1964486
> 4550855 ms
>
> Some context information:
> Neo4j Milestone Release 2.1.0-M01
> Windows 7
> java version "1.7.0_03"
>
> Best,
> Aram
>
> --
> You received this message because you are subscribed to the Google Groups
> "Neo4j" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/groups/opt_out.
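
To put the reported figures in perspective, they can be turned into rough rates. The sketch below is plain Python arithmetic on the numbers from the output quoted above (1964486 nodes, 58934580 properties, 4550855 ms); the function names `summarize_import` and `commits_needed` are hypothetical, and the batch sizes 1,000 and 10,000 are simply the values suggested earlier in the thread. It assumes roughly one commit per full or partial batch, and it says nothing about how fast the import would actually run with a different batch size.

```python
def summarize_import(nodes, properties, elapsed_ms):
    """Derive simple rates from the counters printed after the LOAD CSV run."""
    elapsed_s = elapsed_ms / 1000.0
    return {
        "minutes": elapsed_s / 60.0,
        "nodes_per_second": nodes / elapsed_s,
        "properties_per_node": properties / nodes,
    }


def commits_needed(rows, batch_size):
    """Approximate number of commits for USING PERIODIC COMMIT <batch_size>.

    Assumption: one commit per full or partial batch of rows.
    Uses integer ceiling division to avoid floating-point rounding.
    """
    return -(-rows // batch_size)


# Figures taken from the output quoted in the thread.
stats = summarize_import(nodes=1964486, properties=58934580, elapsed_ms=4550855)
print("elapsed minutes:     %.1f" % stats["minutes"])
print("nodes per second:    %.1f" % stats["nodes_per_second"])
print("properties per node: %.1f" % stats["properties_per_node"])

# 500 is the value used in the original query; the others were suggested as things to try.
for batch in (500, 1000, 10000):
    print("PERIODIC COMMIT %5d -> about %d commits" % (batch, commits_needed(1964486, batch)))
```

So the run took a bit under 76 minutes, at a little over 430 nodes per second, with exactly 30 properties per node (one per CSV column).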
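
The U+F8FF clean-up mentioned in the original post could be scripted rather than done by hand. Here is a minimal sketch in Python, assuming the CSV can be read as text (the thread does not state the file's encoding); the function name `strip_char` and the sample rows are hypothetical and only illustrate the idea. Whether removing the character is the right fix for the "[null] is not a supported property value" error is not something this sketch settles.

```python
import io

TARGET_CHAR = "\uf8ff"  # U+F8FF, the character deleted in the original post


def strip_char(src, dst, char=TARGET_CHAR):
    """Copy text from src to dst line by line, dropping every occurrence of char.

    src and dst are text-mode file objects. Returns the number of characters removed.
    """
    removed = 0
    for line in src:
        removed += line.count(char)
        dst.write(line.replace(char, ""))
    return removed


# Demo on in-memory data (made-up rows). For a real file, open the input and
# output paths in text mode with the appropriate encoding and pass the file
# objects to strip_char instead.
sample = io.StringIO('1,"ACME\uf8ff CORP",NEW YORK\n2,"PLAIN INC",ALBANY\n')
cleaned = io.StringIO()
removed_count = strip_char(sample, cleaned)
print("characters removed:", removed_count)
print(cleaned.getvalue(), end="")
```

Processing line by line keeps memory use flat, which matters for a file of nearly two million rows.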
