Cool! I want to improve the documentation so it's more obvious what to do in the future. Am I right in assuming that this is the page in the docs that you'd likely read about MERGE? http://neo4j.com/docs/stable/query-merge.html
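
To make that concrete, here is a rough sketch of the kind of example I think would help on that page. It is only a sketch, trimmed to two properties, and it assumes (as in your test file) that missing CSV fields arrive as empty strings. The idea is to MERGE only on the property that identifies the node, skip rows where that key is empty, and set everything else with ON CREATE SET.

```cypher
// Sketch only: MERGE on the identifying key, never on values that may be null.
LOAD CSV WITH HEADERS FROM "file:c:/Users/Jean/Downloads/CompaniesTEST.csv" AS line
// Skip rows without a usable key; otherwise every row with an empty
// permalink would be merged into one node with permalink "".
WITH line WHERE line.permalink <> ""
MERGE (a:COMPANY {permalink: line.permalink})
// toInt("") returns null. Assigning null in SET does not fail,
// it simply leaves the property unset.
ON CREATE SET a.founded_year = toInt(line.founded_year),
              a.funding_rounds = toInt(line.funding_rounds)
```

Labels are case-sensitive, so any index or constraint has to be created on :COMPANY (not :Company) for this MERGE to use it.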
On 4 November 2014 11:07, Jean Villedieu <[email protected]> wrote:

> Looks like it is working :)
> As usual, I'll do a blog post documenting all this. Thanks for your help
> Mark!
>
> On Tuesday, November 4, 2014 11:21:31 AM UTC+1, Mark Needham wrote:
>
>> Ah ok, so I think when you call toint on that row it's becoming null?
>>
>> $ return toint("");
>> +-----------+
>> | toint("") |
>> +-----------+
>> | <null>    |
>> +-----------+
>> 1 row
>> 1296 ms
>>
>> But in fact I'd suggest that what makes a company unique is the permalink
>> so you don't actually need to have founded_year in the MERGE statement
>> anyway.
>>
>> USING PERIODIC COMMIT 1000
>> LOAD CSV WITH HEADERS FROM "file:c:/Users/Jean/Downloads/CompaniesTEST.csv"
>> AS line
>> MERGE (a:COMPANY {permalink: line.permalink})
>> ON CREATE SET a.funding_total = line.funding_total_usd,
>>   a.funding_rounds = toInt(line.funding_rounds),
>>   a.founded_at = line.founded_at,
>>   a.founded_month = line.founded_month,
>>   a.founded_quarter = line.founded_quarter,
>>   a.founded_year = toInt(line.founded_year),
>>   a.first_funding_at = line.first_funding_at,
>>   a.last_funding_at = line.last_funding_at
>> MERGE (b:CATEGORY {name: line.category_list})
>> MERGE (c:MARKET {name: line.market})
>> MERGE (d:STATUS {name: line.status})
>> MERGE (e:COUNTRY {name: line.country_code})
>> MERGE (f:STATE {name: line.state_code})
>> MERGE (g:REGION {name: line.region})
>> MERGE (h:CITY {name: line.city})
>> CREATE (a)-[:HAS_CATEGORY]->(b)
>> CREATE (a)-[:HAS_MARKET]->(c)
>> CREATE (a)-[:HAS_STATUS]->(d)
>> CREATE (a)-[:HAS_COUNTRY]->(e)
>> CREATE (a)-[:HAS_STATE]->(f)
>> CREATE (a)-[:HAS_REGION]->(g)
>> CREATE (a)-[:HAS_CITY]->(h)
>>
>> Make sure you have an index on :COMPANY before running or it'll be slow!
>>
>> CREATE INDEX ON :COMPANY(permalink)
>>
>> On 4 November 2014 10:17, Jean Villedieu <[email protected]> wrote:
>>
>>> I get:
>>> city: Los Angeles
>>> name: &TV Communications
>>> category_list: |Games|
>>> founded_at:
>>> first_funding_at: 04/06/2010
>>> permalink: /organization/tv-communications
>>> market: Games
>>> founded_quarter:
>>> founded_year:
>>> country_code: USA
>>> homepage_url: http://enjoyandtv.com
>>> funding_total_usd: 4 000 000
>>> founded_month:
>>> funding_rounds: 2
>>> status: operating
>>> region: Los Angeles
>>> state_code: CA
>>> last_funding_at: 23/09/2010
>>>
>>> On Tuesday, November 4, 2014 11:11:34 AM UTC+1, Mark Needham wrote:
>>>>
>>>> Hmm ok, what about:
>>>>
>>>> LOAD CSV WITH HEADERS FROM "file:c:/Users/Jean/Downloads/CompaniesTEST.csv"
>>>> AS line
>>>> WITH line WHERE line.founded_year = ""
>>>> RETURN line
>>>>
>>>> On 4 November 2014 10:10, Jean Villedieu <[email protected]> wrote:
>>>>
>>>>> Thanks Mark. It returns nothing: Returned 0 rows in 276 ms
>>>>>
>>>>> On Tuesday, November 4, 2014 10:53:20 AM UTC+1, Mark Needham wrote:
>>>>>>
>>>>>> Try this:
>>>>>>
>>>>>> LOAD CSV WITH HEADERS FROM
>>>>>> "file:c:/Users/Jean/Downloads/CompaniesTEST.csv"
>>>>>> AS line
>>>>>> WITH line WHERE line.founded_year is null
>>>>>> RETURN line
>>>>>>
>>>>>> Does it return any rows?
>>>>>>
>>>>>> On 4 November 2014 09:39, Jean Villedieu <[email protected]> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I have already experimented with the CSV import and now I'm trying
>>>>>>> to use it with the Crunchbase database
>>>>>>> <http://info.crunchbase.com/2013/04/crunchbase-meets-excel/>.
>>>>>>> Thanks to Michael Hunger's posts, I have managed to get started but
>>>>>>> I keep getting this message:
>>>>>>>
>>>>>>> Cannot merge node using null property value for founded_year (Failure
>>>>>>> when processing URL 'file:c:/Users/Jean/Downloads/CompaniesTEST.csv' on
>>>>>>> line 4 (which is the last row in the file). No rows seem to have been
>>>>>>> committed. Note that this information might not be accurate.)
>>>>>>>
>>>>>>> I think it has to do with null values.
>>>>>>>
>>>>>>> Here is what I used to do the importation:
>>>>>>>
>>>>>>> create constraint on (a:Company) assert a.permalink is unique
>>>>>>> create constraint on (b:Category) assert b.category_list is unique
>>>>>>> create constraint on (c:MARKET) assert c.market is unique
>>>>>>> create constraint on (d:STATUS) assert d.status is unique
>>>>>>> create constraint on (e:COUNTRY) assert e.country_code is unique
>>>>>>> create constraint on (f:STATE) assert f.state_code is unique
>>>>>>> create constraint on (g:REGION) assert g.region is unique
>>>>>>> create constraint on (h:CITY) assert h.city is unique
>>>>>>> CREATE INDEX ON :Company(permalink)
>>>>>>>
>>>>>>> USING PERIODIC COMMIT 1000
>>>>>>> LOAD CSV WITH HEADERS FROM
>>>>>>> "file:c:/Users/Jean/Downloads/CompaniesTEST.csv"
>>>>>>> AS line
>>>>>>> MERGE (a:COMPANY {permalink: line.permalink,
>>>>>>>   funding_total: line.funding_total_usd,
>>>>>>>   funding_rounds: toInt(line.funding_rounds),
>>>>>>>   founded_at: line.founded_at,
>>>>>>>   founded_month: line.founded_month,
>>>>>>>   founded_quarter: line.founded_quarter,
>>>>>>>   founded_year: toInt(line.founded_year),
>>>>>>>   first_funding_at: line.first_funding_at,
>>>>>>>   last_funding_at: line.last_funding_at})
>>>>>>> MERGE (b:CATEGORY {name: line.category_list})
>>>>>>> MERGE (c:MARKET {name: line.market})
>>>>>>> MERGE (d:STATUS {name: line.status})
>>>>>>> MERGE (e:COUNTRY {name: line.country_code})
>>>>>>> MERGE (f:STATE {name: line.state_code})
>>>>>>> MERGE (g:REGION {name: line.region})
>>>>>>> MERGE (h:CITY {name: line.city})
>>>>>>> CREATE (a)-[:HAS_CATEGORY]->(b)
>>>>>>> CREATE (a)-[:HAS_MARKET]->(c)
>>>>>>> CREATE (a)-[:HAS_STATUS]->(d)
>>>>>>> CREATE (a)-[:HAS_COUNTRY]->(e)
>>>>>>> CREATE (a)-[:HAS_STATE]->(f)
>>>>>>> CREATE (a)-[:HAS_REGION]->(g)
>>>>>>> CREATE (a)-[:HAS_CITY]->(h)
>>>>>>>
>>>>>>> FYI when I create the COMPANY nodes with permalink as a single
>>>>>>> property, it works. What am I doing wrong?
>>>>>>> The csv file I'm using can be found here:
>>>>>>> https://gist.github.com/jvilledieu/c3afe5bc21da28880a30
>>>>>>>
>>>>>>> Thanks for your help!
>>>>>>>
>>>>>>> --
>>>>>>> You received this message because you are subscribed to the Google
>>>>>>> Groups "Neo4j" group.
>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>> send an email to [email protected].
>>>>>>> For more options, visit https://groups.google.com/d/optout.
