Looks like it is working :)
As usual, I'll do a blog post documenting all this. Thanks for your help,
Mark!
On Tuesday, November 4, 2014 11:21:31 AM UTC+1, Mark Needham wrote:
>
> Ah ok, so I think when you call toint on that row it's becoming null?
>
> $ return toint("");
> +-----------+
> | toint("") |
> +-----------+
> | <null>    |
> +-----------+
> 1 row
> 1296 ms
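>
> A minimal sketch of how to spot the affected rows up front, assuming the
> same file path used in this thread and that toInt returns null for a blank
> founded_year, as in the output above (the returned columns are only an
> illustrative choice):
>
> ```cypher
> // List every row whose founded_year does not convert to an integer.
> LOAD CSV WITH HEADERS FROM "file:c:/Users/Jean/Downloads/CompaniesTEST.csv" AS line
> WITH line WHERE toInt(line.founded_year) IS NULL
> RETURN line.permalink, line.founded_year
> ```
>
> Any row listed here would fail a MERGE that uses toInt(line.founded_year)
> as one of its key properties.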
>
> But in fact I'd suggest that what makes a company unique is the permalink
> so you don't actually need to have founded_year in the MERGE statement
> anyway.
>
> USING PERIODIC COMMIT 1000
> LOAD CSV WITH HEADERS FROM
> "file:c:/Users/Jean/Downloads/CompaniesTEST.csv" AS line
> MERGE (a:COMPANY {permalink: line.permalink})
> ON CREATE SET a.funding_total = line.funding_total_usd,
>   a.funding_rounds = toInt(line.funding_rounds),
>   a.founded_at = line.founded_at,
>   a.founded_month = line.founded_month,
>   a.founded_quarter = line.founded_quarter,
>   a.founded_year = toInt(line.founded_year),
>   a.first_funding_at = line.first_funding_at,
>   a.last_funding_at = line.last_funding_at
> MERGE (b:CATEGORY {name: line.category_list})
> MERGE (c:MARKET {name: line.market})
> MERGE (d:STATUS {name: line.status})
> MERGE (e:COUNTRY {name: line.country_code})
> MERGE (f:STATE {name: line.state_code})
> MERGE (g:REGION {name: line.region})
> MERGE (h:CITY {name: line.city})
> CREATE (a)-[:HAS_CATEGORY]->(b)
> CREATE (a)-[:HAS_MARKET]->(c)
> CREATE (a)-[:HAS_STATUS]->(d)
> CREATE (a)-[:HAS_COUNTRY]->(e)
> CREATE (a)-[:HAS_STATE]->(f)
> CREATE (a)-[:HAS_REGION]->(g)
> CREATE (a)-[:HAS_CITY]->(h)
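>
> One possible variation, not part of the query above: CREATE adds a new
> relationship every time the import runs, so loading the same file twice
> would double them up. A sketch that uses MERGE for the relationship
> instead, shown for the market only and assuming the same file and labels:
>
> ```cypher
> // MERGE on the relationship only creates it if it does not exist yet.
> LOAD CSV WITH HEADERS FROM "file:c:/Users/Jean/Downloads/CompaniesTEST.csv" AS line
> MERGE (a:COMPANY {permalink: line.permalink})
> MERGE (c:MARKET {name: line.market})
> MERGE (a)-[:HAS_MARKET]->(c)
> ```
>
> The same pattern would apply to the other HAS_* relationships if re-running
> the import is a concern.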
>
> Make sure you have an index on :COMPANY before running or it'll be slow!
>
> CREATE INDEX ON :COMPANY(permalink)
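>
> Labels are case-sensitive, so the index has to be on :COMPANY (the label
> used in the MERGE), not :Company. As an alternative sketch, assuming
> permalink really is unique per company, a uniqueness constraint also
> provides the index, so the separate CREATE INDEX is not needed:
>
> ```cypher
> // Enforces uniqueness and backs lookups on permalink with an index.
> CREATE CONSTRAINT ON (a:COMPANY) ASSERT a.permalink IS UNIQUE
> ```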
>
> On 4 November 2014 10:17, Jean Villedieu <[email protected]> wrote:
>
>> I get:
>>
>> city: Los Angeles
>> name: &TV Communications
>> category_list: |Games|
>> founded_at: (empty)
>> first_funding_at: 04/06/2010
>> permalink: /organization/tv-communications
>> market: Games
>> founded_quarter: (empty)
>> founded_year: (empty)
>> country_code: USA
>> homepage_url: http://enjoyandtv.com
>> funding_total_usd: 4 000 000
>> founded_month: (empty)
>> funding_rounds: 2
>> status: operating
>> region: Los Angeles
>> state_code: CA
>> last_funding_at: 23/09/2010
>>
>> On Tuesday, November 4, 2014 11:11:34 AM UTC+1, Mark Needham wrote:
>>>
>>> Hmm ok, what about:
>>>
>>> LOAD CSV WITH HEADERS FROM "file:c:/Users/Jean/Downloads/CompaniesTEST.csv"
>>> AS line
>>> WITH line WHERE line.founded_year = ""
>>> RETURN line
>>>
>>>
>>> On 4 November 2014 10:10, Jean Villedieu <[email protected]> wrote:
>>>
>>>> Thanks Mark. It returns nothing: Returned 0 rows in 276 ms
>>>>
>>>> On Tuesday, November 4, 2014 10:53:20 AM UTC+1, Mark Needham wrote:
>>>>>
>>>>> Try this:
>>>>>
>>>>> LOAD CSV WITH HEADERS FROM
>>>>> "file:c:/Users/Jean/Downloads/CompaniesTEST.csv"
>>>>> AS line
>>>>> WITH line WHERE line.founded_year is null
>>>>> RETURN line
>>>>>
>>>>> Does it return any rows?
>>>>>
>>>>> On 4 November 2014 09:39, Jean Villedieu <[email protected]> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I have already experimented with the CSV import and now I'm trying to
>>>>>> use it with the Crunchbase database
>>>>>> <http://info.crunchbase.com/2013/04/crunchbase-meets-excel/>.
>>>>>> Thanks to Michael Hunger's posts, I have managed to get started but I
>>>>>> keep getting this message:
>>>>>>
>>>>>> Cannot merge node using null property value for founded_year (Failure
>>>>>> when processing URL 'file:c:/Users/Jean/Downloads/CompaniesTEST.csv' on
>>>>>> line 4 (which is the last row in the file). No rows seem to have been
>>>>>> committed. Note that this information might not be accurate.)
>>>>>>
>>>>>> I think it has to do with null values.
>>>>>>
>>>>>> Here is what I used for the import:
>>>>>> create constraint on (a:Company) assert a.permalink is unique
>>>>>> create constraint on (b:Category) assert b.category_list is unique
>>>>>> create constraint on (c:MARKET) assert c.market is unique
>>>>>> create constraint on (d:STATUS) assert d.status is unique
>>>>>> create constraint on (e:COUNTRY) assert e.country_code is unique
>>>>>> create constraint on (f:STATE) assert f.state_code is unique
>>>>>> create constraint on (g:REGION) assert g.region is unique
>>>>>> create constraint on (h:CITY) assert h.city is unique
>>>>>> CREATE INDEX ON :Company(permalink)
>>>>>> USING PERIODIC COMMIT 1000
>>>>>> LOAD CSV WITH HEADERS FROM
>>>>>> "file:c:/Users/Jean/Downloads/CompaniesTEST.csv"
>>>>>> AS line
>>>>>> MERGE (a:COMPANY {
>>>>>>   permalink: line.permalink,
>>>>>>   funding_total: line.funding_total_usd,
>>>>>>   funding_rounds: toInt(line.funding_rounds),
>>>>>>   founded_at: line.founded_at,
>>>>>>   founded_month: line.founded_month,
>>>>>>   founded_quarter: line.founded_quarter,
>>>>>>   founded_year: toInt(line.founded_year),
>>>>>>   first_funding_at: line.first_funding_at,
>>>>>>   last_funding_at: line.last_funding_at
>>>>>> })
>>>>>> MERGE (b:CATEGORY {name: line.category_list})
>>>>>> MERGE (c:MARKET {name: line.market})
>>>>>> MERGE (d:STATUS {name: line.status})
>>>>>> MERGE (e:COUNTRY {name: line.country_code})
>>>>>> MERGE (f:STATE {name: line.state_code})
>>>>>> MERGE (g:REGION {name: line.region})
>>>>>> MERGE (h:CITY {name: line.city})
>>>>>> CREATE (a)-[:HAS_CATEGORY]->(b)
>>>>>> CREATE (a)-[:HAS_MARKET]->(c)
>>>>>> CREATE (a)-[:HAS_STATUS]->(d)
>>>>>> CREATE (a)-[:HAS_COUNTRY]->(e)
>>>>>> CREATE (a)-[:HAS_STATE]->(f)
>>>>>> CREATE (a)-[:HAS_REGION]->(g)
>>>>>> CREATE (a)-[:HAS_CITY]->(h)
>>>>>>
>>>>>> FYI when I create the COMPANY nodes with permalink as a single
>>>>>> property, it works. What am I doing wrong?
>>>>>> The csv file I'm using can be found here:
>>>>>> https://gist.github.com/jvilledieu/c3afe5bc21da28880a30
>>>>>>
>>>>>> Thanks for your help!
>>>>>>
>>>>>> --
>>>>>> You received this message because you are subscribed to the Google
>>>>>> Groups "Neo4j" group.
>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>> send an email to [email protected].
>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>>