Would you be able to run neo4j-shell (or the console in the old web UI at http://localhost:7474/webadmin), prefix your query with the profile keyword, and send the output?
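
For example, taking the query from your first message as-is (with your 'SOMEIDHERE' placeholder standing in for a real id), the profiled version in neo4j-shell would look roughly like this. It is only a sketch of the idea; adjust it to whatever query you are actually running:

```cypher
// Same query as in the original message, with the profile keyword in front.
// 'SOMEIDHERE' is the placeholder from the original post, not a real id.
profile
MATCH (d:Document)-[*2]-(something)
WHERE d.source_id = 'SOMEIDHERE'
RETURN d, something;
```
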
Also prefix it with profile cypher 2.1.experimental and do the same.

Thanks a lot,
Michael

On Thu, Nov 13, 2014 at 3:18 PM, Eric Gade <[email protected]> wrote:

> Hi Michael.
>
> Yes, I indexed the `source_id` properties for all nodes using the exact
> syntax you described. I did it after the fact though, meaning after I had
> migrated data into the graph. I then went through and did MATCH(d) SET
> d.source_id=d.source_id just to be safe.
>
> I'm not sure what the terminology is for relationships exactly, but mine
> are definitely vectors in that :MENTIONS and :CONTAINS have arrows and only
> go in one direction. For example, a document -[:MENTIONS]-> a country, but
> not the other way around.
>
> On Wednesday, November 12, 2014 8:47:59 PM UTC-5, Michael Hunger wrote:
>>
>> Hi Eric,
>>
>> did you do:
>>
>> create index on :Document(source_id);
>>
>> Also your relationships are they bi-directional between the same two
>> nodes?
>>
>> On Wed, Nov 12, 2014 at 11:06 PM, Eric Gade <[email protected]> wrote:
>>
>>> Hello. I have created what I believe is a not-terribly-complex Neo
>>> database. If you want to cut to the chase, just scroll down to the section
>>> called "*The Problem*"
>>>
>>> Here is the structure:
>>>
>>> *Nodes*
>>>
>>> (:Document) ~75k
>>> (:Country) ~300
>>> (:Person) ~8k
>>>
>>> *Relationships*
>>>
>>> -[:MENTIONS]-> ~300k
>>>
>>> *System Information*
>>>
>>> 16 Cores
>>> 480gb HD
>>> 48GB RAM
>>> Ubuntu Server 14.04 LTS
>>> Neo4j Version 2.1.5
>>>
>>> *Config*
>>>
>>> All I've adjusted in the config is the min and max heap size (disabled by
>>> default)
>>> Min: 2048
>>> Max: 4096
>>>
>>> I set the max open files to 60000 from the default 1024 for my system
>>> (Linux users know what I'm talking about)
>>>
>>> I set a max query time of two minutes via the
>>> `org.neo4j.server.webserver.limit.executiontimeout` param, though I
>>> only did this recently because many queries were taking longer than two
>>> minutes.
>>> Prior to this, certain queries which I would guess should be fast
>>> would never finish (see below).
>>>
>>> I have also indexed a parameter on all nodes called `source_id`, which
>>> is the `id` value for these things in the database from which I imported
>>> them.
>>>
>>>
>>> *Weird Observations*
>>>
>>> Before I altered the max and min heap sizes in the config file, `htop`
>>> was showing me some (alarming??) stats -- VIRT was 17.5GB for the server
>>> process.
>>> Now, with the new settings, it's at a much lower 10100M, but I still
>>> don't understand why.
>>>
>>>
>>> *The Problem*
>>>
>>> Here's a simple query that never returns. I've waited as long as 5
>>> minutes and still nothing:
>>> MATCH(d:Document)-[*2]-(something)
>>> WHERE d.source_id='SOMEIDHERE'
>>> RETURN d,something;
>>>
>>> Based on some of the queries I've seen other people talk about, with
>>> variable relations in the dozens, and for datasets that have millions of
>>> nodes using laptop hardware, something seems very wrong to me here.
>>>
>>> I've read all of the articles I could find on configurations and ways to
>>> improve performance. Any ideas?
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "Neo4j" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> For more options, visit https://groups.google.com/d/optout.
