Hey,

I'm trying to run a query that returns 10 clients and the companies they work for. The query looks like this:

match (c: Client)-[WORKS_FOR]->(co: Company) return c, co limit 10

However, it keeps failing with a Java heap space error. Neo4j is installed on a VM running Windows Server 2012 R2 with an Intel Xeon @ 2.27 GHz and 8 GB of RAM. The graph database is over 30 GB (which is also odd, since the SQL database that was used to populate the graph is only 13 GB). What can I do to improve query performance besides adding indexes?
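One detail in the query above may be worth checking. In Cypher, a name inside the relationship brackets without a leading colon, as in -[WORKS_FOR]->, is a variable and matches a relationship of any type; the colon form -[:WORKS_FOR]-> restricts the match to that type. The sketch below assumes the relationship type really is WORKS_FOR and that Company has a Name property (hypothetical, it does not appear anywhere in this thread). Returning a few properties instead of whole nodes also keeps the result small. Whether either point explains the heap error is not established here.

```cypher
// Sketch only. Assumptions: the relationship type is WORKS_FOR, and
// Company has a Name property (hypothetical; not taken from the thread).
// Id, FirstName and LastName are the Client properties used in the import.
MATCH (c:Client)-[:WORKS_FOR]->(co:Company)
RETURN c.Id, c.FirstName, c.LastName, co.Name
LIMIT 10
```

If the typed version comes back quickly, the untyped pattern was probably matching far more relationships than intended.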
On Wednesday, 18 June 2014, 16:34:10 UTC+3, Michael Hunger wrote:
>
> To me it sounds as if there is a big cross product happening,
> i.e. many Verticals with the same Id.
>
> What happens if you do:
>
> MATCH (v:Vertical)
> RETURN v.Id, count(*)
>
> Michael
>
> On 18.06.2014 at 15:26, Paul Damian <[email protected]> wrote:
>
> Hi,
>
> I've tried with another file, which contains ClientId and VerticalId. The
> thing is, there are only 7 verticals and 11M clients, so there is an
> obvious one-to-many relationship there.
> When I run
> LOAD CSV WITH HEADERS FROM "file:/Users/pauld/Documents/Vertical.csv" AS c
> WITH c LIMIT 100
> MATCH (cli: Client { Id: toInt(c.ClientId)}), (vert: Vertical { Id: toInt(c.VerticalId)})
> Return count(*)
> it returns Neo.DatabaseError.Statement.ExecutionFailure.
> I get the same result when I only match the verticals.
> However, if I run
> LOAD CSV WITH HEADERS FROM "file:/Users/pauld/Documents/Vertical.csv" AS c
> WITH c LIMIT 100
> MATCH (cli: Client { Id: toInt(c.ClientId)})
> Return count(*)
> it returns 100.
> I think it has something to do with the fact that the first 100 verticals
> have the same Id.
>
> On Wednesday, 18 June 2014, 14:20:57 UTC+3, Michael Hunger wrote:
>>
>> Sorry,
>>
>> LOAD CSV WITH HEADERS FROM "file:/Users/pauld/Documents/LOCATED_IN.csv" AS c
>> WITH c
>> LIMIT 100
>> MATCH (client: Client { Id: toInt(c.Id)}), (city: City { Id: toInt(c.CityId)})
>> Return count(*)
>>
>> On 18.06.2014 at 11:44, Paul Damian <[email protected]> wrote:
>>
>> I cannot run this command. It returns invalid syntax. The only way I could
>> run it was
>>
>> LOAD CSV WITH HEADERS FROM "file:/Users/pauld/Documents/LOCATED_IN.csv" AS c
>> MATCH (client: Client { Id: toInt(c.Id)}), (city: City { Id: toInt(c.CityId)})
>> Return count(*) Limit 100
>>
>> Also, I think a Skype call would be great.
>>
>> On Tuesday, 17 June 2014, 21:36:05 UTC+3, Michael Hunger wrote:
>>>
>>> Then something is really wrong.
>>>
>>> What happens if you do
>>>
>>> LOAD CSV WITH HEADERS FROM "file:/Users/pauld/Documents/LOCATED_IN.csv" AS c
>>> Limit 100
>>> MATCH (client: Client { Id: toInt(c.Id)}), (city: City { Id: toInt(c.CityId)})
>>> Return count(*)
>>>
>>> I'm at a conference in Amsterdam this week,
>>> but perhaps we can do a Skype call next week?
>>>
>>> Michael
>>>
>>> Sent from mobile device
>>>
>>> On 17.06.2014 at 18:48, Paul Damian <[email protected]> wrote:
>>>
>>> Yes, I do. I keep getting a Java heap space error now. I'm using a
>>> commit size of 100.
>>>
>>> On Tuesday, 17 June 2014, 19:28:05 UTC+3, Michael Hunger wrote:
>>>>
>>>> Ok, cool. And you have the indexes for both :City(Id) and :Client(Id)?
>>>>
>>>> Michael
>>>>
>>>> On 17.06.2014 at 18:15, Paul Damian <[email protected]> wrote:
>>>>
>>>> The first query returns 999996, which is the number of rows in the file,
>>>> and the second one returns Neo.DatabaseError.Statement.ExecutionFailure,
>>>> probably because of the null values. But then I ran the following
>>>> command:
>>>> LOAD CSV WITH HEADERS FROM "file:/Users/pauld/Documents/LOCATED_IN.csv" AS c
>>>> MATCH (city:City { Id: toInt(c.CityId)})
>>>> WHERE coalesce(c.CityId,"") <> ""
>>>> RETURN count(*)
>>>>
>>>> and I get 992980.
>>>>
>>>> On Tuesday, 17 June 2014, 17:55:56 UTC+3, Michael Hunger wrote:
>>>>>
>>>>> No, you can just filter out the lines with no CityId.
>>>>>
>>>>> Did you run my suggested commands?
>>>>>
>>>>> LOAD CSV WITH HEADERS FROM "file:/Users/pauld/Documents/LOCATED_IN.csv" AS c
>>>>> MATCH (client: Client { Id: toInt(c.Id)})
>>>>> RETURN count(*)
>>>>>
>>>>> LOAD CSV WITH HEADERS FROM "file:/Users/pauld/Documents/LOCATED_IN.csv" AS c
>>>>> MATCH (city: City { Id: toInt(c.CityId)})
>>>>> RETURN count(*)
>>>>>
>>>>> LOAD CSV WITH HEADERS FROM "file:/Users/pauld/Documents/LOCATED_IN.csv" AS c
>>>>> return c
>>>>> limit 10
>>>>>
>>>>> On 17.06.2014 at 16:37, Paul Damian <[email protected]> wrote:
>>>>>
>>>>> In the file I only have 2 columns: one for the client Id, which is
>>>>> never null, and CityId, which may sometimes be null. Should I export the
>>>>> records from the SQL database leaving out the columns that contain null
>>>>> values?
>>>>>
>>>>> On Tuesday, 17 June 2014, 15:39:14 UTC+3, Michael Hunger wrote:
>>>>>>
>>>>>> If they don't have a value for city id, do they still have empty
>>>>>> columns there? Like "user-id,,
>>>>>>
>>>>>> You probably want to filter these rows?
>>>>>>
>>>>>> LOAD CSV WITH HEADERS FROM "file:/Users/pauld/Documents/LOCATED_IN.csv" AS c
>>>>>> WHERE coalesce(c.CityId,"") <> ""
>>>>>> ...
>>>>>>
>>>>>> On 17.06.2014 at 11:23, Paul Damian <[email protected]> wrote:
>>>>>>
>>>>>> Well, the csv file contains some rows that do not have a value for
>>>>>> CityId, and the rows are unique regarding the ClientId. There are 11M
>>>>>> clients living in 14K cities. Is there a limit on links per node?
>>>>>> Now I've created a piece of code that reads from the file and creates
>>>>>> each relationship, but, as you can imagine, it works really slowly in
>>>>>> this scenario.
>>>>>>
>>>>>>> Did you create an index on :Client(Id) and :City(Id)?
>>>>>>>
>>>>>>> What happens if you do:
>>>>>>>
>>>>>>> LOAD CSV WITH HEADERS FROM "file:/Users/pauld/Documents/LOCATED_IN.csv" AS c
>>>>>>> MATCH (client: Client { Id: toInt(c.Id)})
>>>>>>> RETURN count(*)
>>>>>>>
>>>>>>> LOAD CSV WITH HEADERS FROM "file:/Users/pauld/Documents/LOCATED_IN.csv" AS c
>>>>>>> MATCH (city: City { Id: toInt(c.CityId)})
>>>>>>> RETURN count(*)
>>>>>>>
>>>>>>> Each count should be equal to the number of rows in the file.
>>>>>>>
>>>>>>> Michael
>>>>>>>
>>>>>>> On 16.06.2014 at 17:47, Paul Damian <[email protected]> wrote:
>>>>>>>
>>>>>>> Somehow I've managed to load all the nodes and now I'm trying to
>>>>>>> load the links as well. I read the nodes from the csv file and create
>>>>>>> the relationship between them. I run the following command:
>>>>>>> USING PERIODIC COMMIT 100
>>>>>>> LOAD CSV WITH HEADERS FROM "file:/Users/pauld/Documents/LOCATED_IN.csv" AS c
>>>>>>> MATCH (client: Client { Id: toInt(c.Id)}), (city: City { Id: toInt(c.CityId)})
>>>>>>> CREATE (client)-[r:LOCATED_IN]->(city)
>>>>>>>
>>>>>>> Running with a smaller commit size returns this error:
>>>>>>> Neo.DatabaseError.Statement.ExecutionFailure, while increasing the
>>>>>>> commit size to 10000 throws
>>>>>>> Neo.DatabaseError.General.UnknownFailure.
>>>>>>> Can you help me with this?
>>>>>>>
>>>>>>> On Thursday, 5 June 2014, 12:05:18 UTC+3, Michael Hunger wrote:
>>>>>>>>
>>>>>>>> Perhaps something with field or line terminators?
>>>>>>>>
>>>>>>>> I assume it blows up the field separation.
>>>>>>>>
>>>>>>>> Try to run:
>>>>>>>>
>>>>>>>> LOAD CSV WITH HEADERS FROM "file:/Users/pauld/Documents/Client.csv" AS c
>>>>>>>> RETURN { Id: toInt(c.Id), FirstName: c.FirstName, LastName: c.Lastname,
>>>>>>>>   Address: c.Address, ZipCode: toInt(c.ZipCode), Email: c.Email,
>>>>>>>>   Phone: c.Phone, Fax: c.Fax, BusinessName: c.BusinessName, URL: c.URL,
>>>>>>>>   Latitude: toFloat(c.Latitude), Longitude: toFloat(c.Longitude),
>>>>>>>>   AgencyId: toInt(c.AgencyId), RowStatus: toInt(c.RowStatus)} as data, c as line
>>>>>>>> LIMIT 3
>>>>>>>>
>>>>>>>> On Thu, Jun 5, 2014 at 10:51 AM, Paul Damian <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> I've tried using the shell and I get the same results: nodes with
>>>>>>>>> no properties.
>>>>>>>>> I've created the csv file using MsSQL Server Export. Is it
>>>>>>>>> relevant?
>>>>>>>>>
>>>>>>>>> About your curiosity: I figured I would first import the nodes,
>>>>>>>>> then the relationships from the connection tables. Am I doing it
>>>>>>>>> wrong?
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>>
>>>>>>>>> On Thursday, 5 June 2014, 09:54:31 UTC+3, Michael Hunger wrote:
>>>>>>>>>>
>>>>>>>>>> I'd probably use a commit size in your case of 50k or 100k.
>>>>>>>>>>
>>>>>>>>>> Try to use the neo4j-shell and not the web interface.
>>>>>>>>>>
>>>>>>>>>> Connect to neo4j using bin/neo4j-shell
>>>>>>>>>>
>>>>>>>>>> Then run your commands ending with a semicolon.
>>>>>>>>>>
>>>>>>>>>> Just curious: your data is imported as one node per row? That's
>>>>>>>>>> not really a graph structure.
>>>>>>>>>>
>>>>>>>>>> On Wed, Jun 4, 2014 at 6:56 PM, Paul Damian <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi there,
>>>>>>>>>>>
>>>>>>>>>>> I'm experimenting with Neo4j while benchmarking a bunch of NoSQL
>>>>>>>>>>> databases for my graduation paper.
>>>>>>>>>>> I'm using the web interface to populate the database.
>>>>>>>>>>> I've been able to load the smaller tables from my SQL database
>>>>>>>>>>> and LOAD CSV works fine.
>>>>>>>>>>> By small, I mean a few columns (4-5) and some rows (1 million).
>>>>>>>>>>> However, when I try to upload a larger table (15 columns, 12
>>>>>>>>>>> million rows), it creates the nodes but it doesn't set any properties.
>>>>>>>>>>> I've tried to reduce the number of records (to 100) and also the
>>>>>>>>>>> number of columns (just the Id property), but no luck so far.
>>>>>>>>>>>
>>>>>>>>>>> The Cypher command used is this one:
>>>>>>>>>>> USING PERIODIC COMMIT 100
>>>>>>>>>>> LOAD CSV WITH HEADERS FROM "file:/Users/pauld/Documents/Client.csv" AS c
>>>>>>>>>>> CREATE (:Client { Id: toInt(c.Id), FirstName: c.FirstName,
>>>>>>>>>>>   LastName: c.Lastname, Address: c.Address, ZipCode: toInt(c.ZipCode),
>>>>>>>>>>>   Email: c.Email, Phone: c.Phone, Fax: c.Fax, BusinessName: c.BusinessName,
>>>>>>>>>>>   URL: c.URL, Latitude: toFloat(c.Latitude), Longitude: toFloat(c.Longitude),
>>>>>>>>>>>   AgencyId: toInt(c.AgencyId), RowStatus: toInt(c.RowStatus)})
>>>>>>>>>>>
>>>>>>>>>>> Any help or pointers are welcome,
>>>>>>>>>>> Paul
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> You received this message because you are subscribed to the
>>>>>>>>>>> Google Groups "Neo4j" group.
>>>>>>>>>>> To unsubscribe from this group and stop receiving emails from
>>>>>>>>>>> it, send an email to [email protected].
>>>>>>>>>>> For more options, visit https://groups.google.com/d/optout.
