Re: [Neo4j] LOAD CSV creates nodes but does not set properties

Michael Hunger Tue, 17 Jun 2014 11:36:25 -0700

Then something is really wrong.

What happens if you do


LOAD CSV WITH HEADERS FROM "file:/Users/pauld/Documents/LOCATED_IN.csv" AS c
WITH c LIMIT 100
MATCH (client:Client { Id: toInt(c.Id) }), (city:City { Id: toInt(c.CityId) })
RETURN count(*)
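
For reference, the checks discussed in this thread can be combined into one sketch. It assumes the 2014-era Cypher syntax already used here (`toInt`, `USING PERIODIC COMMIT`) plus `CREATE INDEX ON`, that the CSV headers are exactly `Id` and `CityId`, and that rows with an empty `CityId` should simply be skipped. The commit size of 10000 is only an arbitrary starting point, not a value confirmed to work for this data set.

```cypher
// Indexes so that each MATCH below can be an index lookup.
CREATE INDEX ON :Client(Id);
CREATE INDEX ON :City(Id);

// Dry run: of the first 100 rows that have a CityId,
// count how many resolve to both a Client and a City node.
LOAD CSV WITH HEADERS FROM "file:/Users/pauld/Documents/LOCATED_IN.csv" AS c
WITH c WHERE coalesce(c.CityId, "") <> ""
WITH c LIMIT 100
MATCH (client:Client { Id: toInt(c.Id) }), (city:City { Id: toInt(c.CityId) })
RETURN count(*);

// Import: same filter, but create the relationship instead of counting.
USING PERIODIC COMMIT 10000
LOAD CSV WITH HEADERS FROM "file:/Users/pauld/Documents/LOCATED_IN.csv" AS c
WITH c WHERE coalesce(c.CityId, "") <> ""
MATCH (client:Client { Id: toInt(c.Id) }), (city:City { Id: toInt(c.CityId) })
CREATE (client)-[:LOCATED_IN]->(city);
```

The filter sits in its own `WITH` before the `LIMIT` so that the limit counts only usable rows. Run each statement separately in neo4j-shell, ending it with a semicolon.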

I'm at a conference in Amsterdam this week,
but perhaps we can do a Skype call next week?

Michael



Sent from mobile device

On 17.06.2014 at 18:48, Paul Damian <[email protected]> wrote:

> Yes, I do. I keep getting a Java heap space error now. I'm using a commit 
> size of 100.
> 
> Tuesday, 17 June 2014, 19:28:05 UTC+3, Michael Hunger wrote:
>> 
>> OK, cool. And you have indexes on both :City(Id) and :Client(Id)?
>> 
>> 
>> Michael
>> 
>> On 17.06.2014 at 18:15, Paul Damian <[email protected]> wrote:
>> 
>>> The first query returns 999996 which is the number of rows in the file and 
>>> the second one returns Neo.DatabaseError.Statement.ExecutionFailure
>>>  probably because of the null values. But then I run the following command:
>>> LOAD CSV WITH HEADERS FROM "file:/Users/pauld/Documents/LOCATED_IN.csv" AS c
>>>  MATCH (city:City { Id: toInt(c.CityId)})
>>> WHERE coalesce(c.CityId,"") <> ""
>>> RETURN count(*)
>>> 
>>> and I get 992980
>>> 
>>> 
>>> Tuesday, 17 June 2014, 17:55:56 UTC+3, Michael Hunger wrote:
>>>> No, you can just filter out the lines with no CityId.
>>>> 
>>>> Did you run my suggested commands?
>>>> 
>>>> LOAD CSV WITH HEADERS FROM "file:/Users/pauld/Documents/LOCATED_IN.csv" AS c
>>>> MATCH (client:Client { Id: toInt(c.Id) })
>>>> RETURN count(*)
>>>> 
>>>> LOAD CSV WITH HEADERS FROM "file:/Users/pauld/Documents/LOCATED_IN.csv" AS c
>>>> MATCH (city:City { Id: toInt(c.CityId) })
>>>> RETURN count(*)
>>>> 
>>>> LOAD CSV WITH HEADERS FROM "file:/Users/pauld/Documents/LOCATED_IN.csv" AS c
>>>> RETURN c
>>>> LIMIT 10
>>>> 
>>>> 
>>>> On 17.06.2014 at 16:37, Paul Damian <[email protected]> wrote:
>>>> 
>>>>> The file has only 2 columns: the client Id, which is never null, and 
>>>>> CityId, which is sometimes null. Should I export the records from the SQL 
>>>>> database leaving out the rows that contain null values?
>>>>> 
>>>>> Tuesday, 17 June 2014, 15:39:14 UTC+3, Michael Hunger wrote:
>>>>>> 
>>>>>> If they don't have a value for CityId, do those rows still have empty 
>>>>>> columns there, like "user-id,,"?
>>>>>> 
>>>>>> You probably want to filter these rows?
>>>>>> 
>>>>>> LOAD CSV WITH HEADERS FROM "file:/Users/pauld/Documents/LOCATED_IN.csv" AS c
>>>>>> WITH c WHERE coalesce(c.CityId,"") <> ""
>>>>>> ...
>>>>>> 
>>>>>> On 17.06.2014 at 11:23, Paul Damian <[email protected]> wrote:
>>>>>> 
>>>>>>> Well, the CSV file contains some rows that do not have a value for 
>>>>>>> CityId, and the rows are unique with respect to the client Id. There are 
>>>>>>> 11M clients living in 14K cities. Is there a limit on relationships per node?
>>>>>>> Now I've written a piece of code that reads from the file and creates each 
>>>>>>> relationship, but, as you can imagine, it runs really slowly in this 
>>>>>>> scenario.
>>>>>>>  
>>>>>>>> Did you create an index on :Client(Id) and :City(Id)?
>>>>>>>> 
>>>>>>>> What happens if you do:
>>>>>>>> 
>>>>>>>> LOAD CSV WITH HEADERS FROM "file:/Users/pauld/Documents/LOCATED_IN.csv" AS c
>>>>>>>> MATCH (client:Client { Id: toInt(c.Id) })
>>>>>>>> RETURN count(*)
>>>>>>>> 
>>>>>>>> LOAD CSV WITH HEADERS FROM "file:/Users/pauld/Documents/LOCATED_IN.csv" AS c
>>>>>>>> MATCH (city:City { Id: toInt(c.CityId) })
>>>>>>>> RETURN count(*)
>>>>>>>> 
>>>>>>>> Each count should equal the number of rows in the file.
>>>>>>>> 
>>>>>>>> Michael
>>>>>>>> 
>>>>>>>> On 16.06.2014 at 17:47, Paul Damian <[email protected]> wrote:
>>>>>>>> 
>>>>>>>>> Somehow I've managed to load all the nodes and now I'm trying to load 
>>>>>>>>> the links as well. I read the nodes from csv file and create the 
>>>>>>>>> relation between them. I run the following command:
>>>>>>>>> USING PERIODIC COMMIT 100 
>>>>>>>>>  LOAD CSV WITH HEADERS FROM 
>>>>>>>>> "file:/Users/pauld/Documents/LOCATED_IN.csv" AS c
>>>>>>>>>  MATCH (client: Client { Id: toInt(c.Id)}), (city: City { Id: 
>>>>>>>>> toInt(c.CityId)})
>>>>>>>>>  CREATE (client)-[r:LOCATED_IN]->(city)
>>>>>>>>> 
>>>>>>>>> Running with a smaller commit size returns this error 
>>>>>>>>> Neo.DatabaseError.Statement.ExecutionFailure, while increasing the 
>>>>>>>>> commit size to 10000 throws Neo.DatabaseError.General.UnknownFailure. 
>>>>>>>>> Can you help me with this?
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Thursday, 5 June 2014, 12:05:18 UTC+3, Michael Hunger wrote:
>>>>>>>>>> 
>>>>>>>>>> Perhaps something with field or line terminators?
>>>>>>>>>> 
>>>>>>>>>> I assume it blows up the field separation.
>>>>>>>>>> 
>>>>>>>>>> Try to run:
>>>>>>>>>> 
>>>>>>>>>> LOAD CSV WITH HEADERS FROM "file:/Users/pauld/Documents/Client.csv" 
>>>>>>>>>> AS c
>>>>>>>>>> RETURN { Id: toInt(c.Id), FirstName: c.FirstName, LastName: 
>>>>>>>>>> c.Lastname, Address: c.Address, ZipCode: toInt(c.ZipCode), Email: 
>>>>>>>>>> c.Email, Phone: c.Phone, Fax: c.Fax, BusinessName: c.BusinessName, 
>>>>>>>>>> URL: c.URL, Latitude: toFloat(c.Latitude), Longitude: 
>>>>>>>>>> toFloat(c.Longitude), AgencyId: toInt(c.AgencyId), RowStatus: 
>>>>>>>>>> toInt(c.RowStatus)} as data, c as line
>>>>>>>>>> LIMIT 3
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> On Thu, Jun 5, 2014 at 10:51 AM, Paul Damian <[email protected]> 
>>>>>>>>>> wrote:
>>>>>>>>>>> I've tried using the shell and I get the same results: nodes with 
>>>>>>>>>>> no properties.
>>>>>>>>>>> I've created the csv file using MsSQL Server Export. Is it relevant?
>>>>>>>>>>> 
>>>>>>>>>>> About your curiosity: I figured I would import the nodes first, then 
>>>>>>>>>>> the relationships from the connection tables. Am I doing it wrong?
>>>>>>>>>>> 
>>>>>>>>>>> Thanks
>>>>>>>>>>> 
>>>>>>>>>>> Thursday, 5 June 2014, 09:54:31 UTC+3, Michael Hunger wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>> I'd probably use a commit size in your case of 50k or 100k.
>>>>>>>>>>>> 
>>>>>>>>>>>> Try to use the neo4j-shell and not the web-interface.
>>>>>>>>>>>> 
>>>>>>>>>>>> Connect to neo4j using bin/neo4j-shell
>>>>>>>>>>>> 
>>>>>>>>>>>> Then run your commands ending with a semicolon.
>>>>>>>>>>>> 
>>>>>>>>>>>> Just curious: Your data is imported as one node per row? That's 
>>>>>>>>>>>> not really a graph structure.
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> On Wed, Jun 4, 2014 at 6:56 PM, Paul Damian <[email protected]> 
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>> Hi there,
>>>>>>>>>>>>> 
>>>>>>>>>>>>> I'm experimenting with Neo4j while benchmarking a bunch of NoSQL 
>>>>>>>>>>>>> databases for my graduation paper. 
>>>>>>>>>>>>> I'm using the web interface to populate the database. I've been 
>>>>>>>>>>>>> able to load the smaller tables from my SQL database and LOAD CSV 
>>>>>>>>>>>>> works fine.
>>>>>>>>>>>>> By small, I mean a few columns (4-5) and some rows (1 million). 
>>>>>>>>>>>>> However, when I try to upload a larger table (15 columns, 12 
>>>>>>>>>>>>> million rows), it creates the nodes but it doesn't set any 
>>>>>>>>>>>>> properties.
>>>>>>>>>>>>> I've tried to reduce the number of records (to 100) and also the 
>>>>>>>>>>>>> number of columns( just the Id property ), but no luck so far.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> The cypher command used is this one
>>>>>>>>>>>>> USING PERIODIC COMMIT 100
>>>>>>>>>>>>> LOAD CSV WITH HEADERS FROM 
>>>>>>>>>>>>> "file:/Users/pauld/Documents/Client.csv" AS c
>>>>>>>>>>>>> CREATE (:Client { Id: toInt(c.Id), FirstName: c.FirstName, 
>>>>>>>>>>>>> LastName: c.Lastname, Address: c.Address, ZipCode: 
>>>>>>>>>>>>> toInt(c.ZipCode), Email: c.Email, Phone: c.Phone, Fax: c.Fax, 
>>>>>>>>>>>>> BusinessName: c.BusinessName, URL: c.URL, Latitude: 
>>>>>>>>>>>>> toFloat(c.Latitude), Longitude: toFloat(c.Longitude), AgencyId: 
>>>>>>>>>>>>> toInt(c.AgencyId), RowStatus: toInt(c.RowStatus)})
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Any help and indication is welcomed,
>>>>>>>>>>>>> Paul
>>>>>>>>>>>>> 
>>>>>>>>>>>>> -- 
>>>>>>>>>>>>> You received this message because you are subscribed to the 
>>>>>>>>>>>>> Google Groups "Neo4j" group.
>>>>>>>>>>>>> To unsubscribe from this group and stop receiving emails from it, 
>>>>>>>>>>>>> send an email to [email protected].
>>>>>>>>>>>>> 
>>>>>>>>>>>>> For more options, visit https://groups.google.com/d/optout.

