The first query returns 999996, which is the number of rows in the file, and
the second one returns Neo.DatabaseError.Statement.ExecutionFailure,
probably because of the null values. But then I ran the following command:
LOAD CSV WITH HEADERS FROM "file:/Users/pauld/Documents/LOCATED_IN.csv" AS c
MATCH (city:City { Id: toInt(c.CityId)})
WHERE coalesce(c.CityId,"") <> ""
RETURN count(*)
and I get 992980
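
One way to cross-check these counts outside Neo4j is to count the rows in the CSV file directly. The following is a minimal Python sketch, assuming the file has a header row with a CityId column as in the queries above; the function name count_city_ids and the sample data are hypothetical and not part of the original import.

```python
import csv
import io


def count_city_ids(csv_file):
    """Count data rows and rows whose CityId field is non-empty.

    csv_file is any iterable of text lines with a header row that
    contains a CityId column (an assumption based on the queries above).
    """
    total = 0
    with_city = 0
    for row in csv.DictReader(csv_file):
        total += 1
        # Mirrors coalesce(c.CityId, "") <> "" from the Cypher query:
        # a missing or empty field counts as "no city".
        if (row.get("CityId") or "") != "":
            with_city += 1
    return total, with_city


# Hypothetical sample data, not taken from the real file.
sample = io.StringIO("Id,CityId\n1,10\n2,\n3,12\n")
total, with_city = count_city_ids(sample)
print(total, with_city, total - with_city)
```

For the real file, the same function could be called on open(path, newline=""). If the gap of 7016 rows between 999996 and 992980 comes only from empty CityId values, the third number printed should match it. That is an assumption: rows whose CityId has no matching City node would also lower the MATCH count, and this sketch does not detect those.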
On Tuesday, 17 June 2014, 17:55:56 UTC+3, Michael Hunger wrote:
> No, you can just filter out the lines with no CityId.
>
> Did you run my suggested commands?
>
> LOAD CSV WITH HEADERS FROM "file:/Users/pauld/Documents/LOCATED_IN.csv" AS c
> MATCH (client: Client { Id: toInt(c.Id)})
> RETURN count(*)
>
> LOAD CSV WITH HEADERS FROM "file:/Users/pauld/Documents/LOCATED_IN.csv" AS c
> MATCH (city: City { Id: toInt(c.CityId)})
> RETURN count(*)
>
> LOAD CSV WITH HEADERS FROM "file:/Users/pauld/Documents/LOCATED_IN.csv" AS c
> RETURN c
> LIMIT 10
>
>
> On 17.06.2014 at 16:37, Paul Damian <[email protected]> wrote:
>
> In the file I have only two columns: one for the client id, which is never
> null, and CityId, which may sometimes be null. Should I export the records
> from the SQL database leaving out the rows that contain null values?
>
> On Tuesday, 17 June 2014, 15:39:14 UTC+3, Michael Hunger wrote:
>>
>> If they don't have a value for city id, do they still have empty columns
>> there, like "user-id,,"?
>>
>> You probably want to filter these rows?
>>
>> LOAD CSV WITH HEADERS FROM "file:/Users/pauld/Documents/LOCATED_IN.csv" AS c
>> WITH c WHERE coalesce(c.CityId,"") <> ""
>> ...
>>
>> On 17.06.2014 at 11:23, Paul Damian <[email protected]> wrote:
>>
>> Well, the CSV file contains some rows that do not have a value for
>> CityId, and the rows are unique with respect to the client id. There are
>> 11M clients living in 14K cities. Is there a limit on links per node?
>> I've now written a piece of code that reads from the file and creates each
>> relationship, but, as you can imagine, it runs really slowly in this
>> scenario.
>>
>>
>>> Did you create an index on :Client(Id) and :City(Id)?
>>>
>>> What happens if you do:
>>>
>>> LOAD CSV WITH HEADERS FROM "file:/Users/pauld/Documents/LOCATED_IN.csv"
>>> AS c
>>> MATCH (client: Client { Id: toInt(c.Id)})
>>>
>>> RETURN count(*)
>>>
>>> LOAD CSV WITH HEADERS FROM "file:/Users/pauld/Documents/LOCATED_IN.csv"
>>> AS c
>>> MATCH (city: City { Id: toInt(c.CityId)})
>>>
>>> RETURN count(*)
>>>
>>> each count should be equivalent to the # of rows in the file.
>>>
>>> Michael
>>>
>>> On 16.06.2014 at 17:47, Paul Damian <[email protected]> wrote:
>>>
>>> Somehow I've managed to load all the nodes and now I'm trying to load
>>> the links as well. I read the nodes from csv file and create the relation
>>> between them. I run the following command:
>>> USING PERIODIC COMMIT 100
>>> LOAD CSV WITH HEADERS FROM "file:/Users/pauld/Documents/LOCATED_IN.csv"
>>> AS c
>>> MATCH (client: Client { Id: toInt(c.Id)}), (city: City { Id:
>>> toInt(c.CityId)})
>>> CREATE (client)-[r:LOCATED_IN]->(city)
>>>
>>> Running with a smaller commit size returns this error
>>> Neo.DatabaseError.Statement.ExecutionFailure, while increasing the
>>> commit size to 10000 throws Neo.DatabaseError.General.UnknownFailure.
>>> Can you help me with this?
>>>
>>>
>>> On Thursday, 5 June 2014, 12:05:18 UTC+3, Michael Hunger wrote:
>>>>
>>>> Perhaps something with field or line terminators?
>>>>
>>>> I assume it blows up the field separation.
>>>>
>>>> Try to run:
>>>>
>>>> LOAD CSV WITH HEADERS FROM "file:/Users/pauld/Documents/Client.csv" AS
>>>> c
>>>> RETURN { Id: toInt(c.Id), FirstName: c.FirstName, LastName: c.Lastname,
>>>> Address: c.Address, ZipCode: toInt(c.ZipCode), Email: c.Email, Phone:
>>>> c.Phone, Fax: c.Fax, BusinessName: c.BusinessName, URL: c.URL, Latitude:
>>>> toFloat(c.Latitude), Longitude: toFloat(c.Longitude), AgencyId:
>>>> toInt(c.AgencyId), RowStatus: toInt(c.RowStatus)} as data, c as line
>>>> LIMIT 3
>>>>
>>>>
>>>>
>>>> On Thu, Jun 5, 2014 at 10:51 AM, Paul Damian <[email protected]>
>>>> wrote:
>>>>
>>>>> I've tried using the shell and I get the same results: nodes with no
>>>>> properties.
>>>>> I've created the csv file using MsSQL Server Export. Is it relevant?
>>>>>
>>>>> About your curiosity: I figured I would import the nodes first, then
>>>>> the relationships from the connection tables. Am I doing it wrong?
>>>>>
>>>>> Thanks
>>>>>
>>>>> On Thursday, 5 June 2014, 09:54:31 UTC+3, Michael Hunger wrote:
>>>>>>
>>>>>> I'd probably use a commit size in your case of 50k or 100k.
>>>>>>
>>>>>> Try to use the neo4j-shell and not the web-interface.
>>>>>>
>>>>>> Connect to neo4j using bin/neo4j-shell
>>>>>>
>>>>>> Then run your commands ending with a semicolon.
>>>>>>
>>>>>> Just curious: Your data is imported as one node per row? That's not
>>>>>> really a graph structure.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, Jun 4, 2014 at 6:56 PM, Paul Damian <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi there,
>>>>>>>
>>>>>>> I'm experimenting with Neo4j while benchmarking a bunch of NoSQL
>>>>>>> databases for my graduation paper.
>>>>>>> I'm using the web interface to populate the database. I've been able
>>>>>>> to load the smaller tables from my SQL database and LOAD CSV works fine.
>>>>>>> By small, I mean a few columns (4-5) and some rows (1 million).
>>>>>>> However, when I try to upload a larger table (15 columns, 12 million
>>>>>>> rows),
>>>>>>> it creates the nodes but it doesn't set any properties.
>>>>>>> I've tried to reduce the number of records (to 100) and also the
>>>>>>> number of columns (just the Id property), but no luck so far.
>>>>>>>
>>>>>>> The cypher command used is this one
>>>>>>> USING PERIODIC COMMIT 100
>>>>>>> LOAD CSV WITH HEADERS FROM "file:/Users/pauld/Documents/Client.csv"
>>>>>>> AS c
>>>>>>> CREATE (:Client { Id: toInt(c.Id), FirstName: c.FirstName, LastName:
>>>>>>> c.Lastname, Address: c.Address, ZipCode: toInt(c.ZipCode), Email:
>>>>>>> c.Email,
>>>>>>> Phone: c.Phone, Fax: c.Fax, BusinessName: c.BusinessName, URL: c.URL,
>>>>>>> Latitude: toFloat(c.Latitude), Longitude: toFloat(c.Longitude),
>>>>>>> AgencyId:
>>>>>>> toInt(c.AgencyId), RowStatus: toInt(c.RowStatus)})
>>>>>>>
>>>>>>> Any help and indication is welcomed,
>>>>>>> Paul
>>>>>>>
>>>>>>> --
>>>>>>> You received this message because you are subscribed to the Google
>>>>>>> Groups "Neo4j" group.
>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>> send an email to [email protected].
>>>>>>>
>>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>>>
>>>>>>
>>>>>>