Excuse me, sorry. What I meant to say was that I accidentally left the ID restriction out in that last query and that this was a mistake on my part.
Now that I've run the original query for which I created this post in the shell, it's returning in a reasonable amount of time (about 1.2 seconds). Odd, as there seems to be a crucial difference between this and the web console...

On Monday, November 17, 2014 1:47:18 PM UTC-5, Eric Gade wrote:
>
> Ahh yes, I've just noticed this now. That was an error in the query I let
> run over the weekend. Normally I restrict the document on `source_id` by
>
> On Monday, November 17, 2014 9:30:00 AM UTC-5, Michael Hunger wrote:
>>
>> How many documents do you have and how many conns do they have
>> min,max,avg?
>>
>> As you do a search across all docs and their mentions and then further
>> out, you have to multiply the number of rels.
>>
>> In total that's up to 300k^2 paths you find.
>>
>> Sent from my iPhone
>>
>> On 17.11.2014 at 15:07, Eric Gade <[email protected]> wrote:
>>
>> *UPDATE*:
>>
>> So I left this running on my DigitalOcean server over the weekend and
>> I've just now checked it. Here's the result:
>>
>> neo4j-sh (?)$ profile MATCH (d:Document)-[*2]-(something) RETURN d, something;
>> Error occurred in server thread; nested exception is:
>> java.lang.OutOfMemoryError: Java heap space
>>
>> This seems really odd to me, as I thought I was using a more than
>> reasonable amount of heap space:
>>
>> wrapper.java.initmemory=2048
>> wrapper.java.maxmemory=4096
>>
>> Of course maybe that's not enough, and I have no idea what I'm talking
>> about. It just seems odd that a query that searches for a path with only 2
>> degrees of separation would be this much of a hassle.
>>
>> On Friday, November 14, 2014 4:39:26 PM UTC-5, Eric Gade wrote:
>>>
>>> Hey Michael,
>>>
>>> Because the query never actually finishes, I'm not sure I'm getting the
>>> results you want.
>>>
>>> For vanilla Cypher:
>>>
>>> ==> GuardTimeoutException: timeout occurred (overtime=1)
>>>
>>> For the experimental profile:
>>>
>>> ==> GuardTimeoutException: timeout occurred (overtime=1075)
>>>
>>> BTW, if I remove the timeout limit, the query will not return...or at
>>> least not in some reasonable amount of time that I've been able to measure.
>>> Go ahead and let me know what you think. I'm going to connect to my remote
>>> server and let this command run for a while and see what happens.
>>>
>>> On Thursday, November 13, 2014 11:25:30 AM UTC-5, Michael Hunger wrote:
>>>>
>>>> Would you be able to run neo4j-shell (or the old webui
>>>> http://localhost:7474/webadmin -> console) and prefix your query with
>>>> the profile keyword and send the output?
>>>>
>>>> Also prefix it with profile cypher 2.1.experimental
>>>> and do the same.
>>>>
>>>> Thanks a lot,
>>>>
>>>> Michael
>>>>
>>>> On Thu, Nov 13, 2014 at 3:18 PM, Eric Gade <[email protected]> wrote:
>>>>
>>>>> Hi Michael.
>>>>>
>>>>> Yes, I indexed the `source_id` properties for all nodes using the exact
>>>>> syntax you described. I did it after the fact though, meaning after I had
>>>>> migrated data into the graph. I then went through and did MATCH(d) SET
>>>>> d.source_id=d.source_id just to be safe.
>>>>>
>>>>> I'm not sure what the terminology is for relationships exactly, but
>>>>> mine are definitely vectors in that :MENTIONS and :CONTAINS have arrows
>>>>> and only go in one direction. For example, a document -[:MENTIONS]-> a
>>>>> country, but not the other way around.
>>>>>
>>>>> On Wednesday, November 12, 2014 8:47:59 PM UTC-5, Michael Hunger wrote:
>>>>>>
>>>>>> Hi Eric,
>>>>>>
>>>>>> did you do:
>>>>>>
>>>>>> create index on :Document(source_id);
>>>>>>
>>>>>> Also, are your relationships bi-directional between the same two
>>>>>> nodes?
>>>>>>
>>>>>> On Wed, Nov 12, 2014 at 11:06 PM, Eric Gade <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Hello.
>>>>>>> I have created what I believe is a not-terribly-complex Neo4j
>>>>>>> database. If you want to cut to the chase, just scroll down to the
>>>>>>> section called "*The Problem*"
>>>>>>>
>>>>>>> Here is the structure:
>>>>>>>
>>>>>>> *Nodes*
>>>>>>>
>>>>>>> (:Document) ~75k
>>>>>>> (:Country) ~300
>>>>>>> (:Person) ~8k
>>>>>>>
>>>>>>> *Relationships*
>>>>>>>
>>>>>>> -[:MENTIONS]-> ~300k
>>>>>>>
>>>>>>> *System Information*
>>>>>>>
>>>>>>> 16 Cores
>>>>>>> 480GB HD
>>>>>>> 48GB RAM
>>>>>>> Ubuntu Server 14.04 LTS
>>>>>>> Neo4j Version 2.1.5
>>>>>>>
>>>>>>> *Config*
>>>>>>>
>>>>>>> What I've adjusted in the config is the min and max heap size
>>>>>>> (disabled by default)
>>>>>>> Min: 2048
>>>>>>> Max: 4096
>>>>>>>
>>>>>>> I set the max open files to 60000 from the default 1024 for my
>>>>>>> system (Linux users know what I'm talking about)
>>>>>>>
>>>>>>> I set a max query time of two minutes via the
>>>>>>> `org.neo4j.server.webserver.limit.executiontimeout` param, though I
>>>>>>> only did this recently because many queries were taking longer than two
>>>>>>> minutes. Prior to this, certain queries which I would guess should be
>>>>>>> fast would never finish (see below)
>>>>>>>
>>>>>>> I have also indexed a property on all nodes called `source_id`,
>>>>>>> which is the `id` value for these things in the database from which I
>>>>>>> imported them.
>>>>>>>
>>>>>>> *Weird Observations*
>>>>>>>
>>>>>>> Before I altered the max and min heap sizes in the config file,
>>>>>>> `htop` was showing me some (alarming??) stats -- VIRT was 17.5GB for
>>>>>>> the server process.
>>>>>>> Now, with the new settings, it's at a much lower 10100M, but I still
>>>>>>> don't understand why.
>>>>>>>
>>>>>>> *The Problem*
>>>>>>>
>>>>>>> Here's a simple query that never returns.
>>>>>>> I've waited as long as 5
>>>>>>> minutes and still nothing:
>>>>>>> MATCH(d:Document)-[*2]-(something)
>>>>>>> WHERE d.source_id='SOMEIDHERE'
>>>>>>> RETURN d,something;
>>>>>>>
>>>>>>> Based on some of the queries I've seen other people talk about, with
>>>>>>> variable relations in the dozens, and for datasets that have millions
>>>>>>> of nodes using laptop hardware, something seems very wrong to me here.
>>>>>>>
>>>>>>> I've read all of the articles I could find on configurations and
>>>>>>> ways to improve performance. Any ideas?
>>>>>>>
>>>>>>> --
>>>>>>> You received this message because you are subscribed to the Google
>>>>>>> Groups "Neo4j" group.
>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>> send an email to [email protected].
>>>>>>> For more options, visit https://groups.google.com/d/optout.
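
A rough back-of-envelope sketch of Michael's point about multiplying relationship counts, using the node and relationship counts from the original post. The even spread of relationships across nodes is an assumption for illustration and is not stated anywhere in the thread; the function name is hypothetical, and the :CONTAINS relationships are ignored because their count is not given. Real graphs are usually skewed, so actual path counts would likely be higher.

```python
def estimate_two_hop_paths(documents, entities, mentions):
    """Estimate undirected two-hop paths that start at a Document node.

    Assumes (for illustration only) that MENTIONS relationships are spread
    evenly across documents and across the entities they point to.
    """
    avg_out_per_document = mentions / documents
    avg_in_per_entity = mentions / entities

    # document -> entity -> some other relationship on that entity.
    # The relationship used for the first hop is not reused for the second.
    paths_per_document = avg_out_per_document * (avg_in_per_entity - 1)

    # Without the source_id restriction every document is a starting point.
    paths_all_documents = documents * paths_per_document
    return paths_per_document, paths_all_documents


# Counts taken from the original post.
DOCUMENTS = 75_000
ENTITIES = 300 + 8_000      # (:Country) + (:Person)
MENTIONS = 300_000

per_doc, all_docs = estimate_two_hop_paths(DOCUMENTS, ENTITIES, MENTIONS)
upper_bound = MENTIONS ** 2  # the "up to 300k^2" figure quoted in the thread

print(f"paths from one document (restricted query): ~{per_doc:,.0f}")
print(f"paths from all documents (unrestricted):    ~{all_docs:,.0f}")
print(f"upper bound quoted in the thread:            {upper_bound:,}")
```

Under these assumptions the restricted query touches on the order of a hundred paths, while the unrestricted one has to materialise millions of rows, which is at least consistent with the 1.2 second result on one side and the heap exhaustion on the other.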
