Re: [Neo4j] Very Slow Simple Queries (System Details Inside)

Eric Gade Mon, 17 Nov 2014 06:08:18 -0800

*UPDATE*:

So I left this running on my DigitalOcean server over the weekend and I've 
just now checked it. Here's the result:


neo4j-sh (?)$ profile MATCH (d:Document)-[*2]-(something) RETURN d, something;
Error occurred in server thread; nested exception is:
        java.lang.OutOfMemoryError: Java heap space

This seems really odd to me, as I thought I was using a more than 
reasonable amount of heap space:

wrapper.java.initmemory=2048
wrapper.java.maxmemory=4096

Of course maybe that's not enough, and I have no idea what I'm talking 
about. It just seems odd that a query that searches for a path with only 2 
degrees of separation would be this much of a hassle.
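
One possible explanation, sketched in Python under loudly stated assumptions: the profiled query above has no WHERE clause, so it expands from every :Document, and each result row is a pair of distinct relationships that meet at a middle node. The function and variable names below are hypothetical, and the even spread of the ~300k relationships over ~300 hub nodes is an assumption made for illustration, not a measurement from this database.

```python
from collections import defaultdict


def count_two_hop_rows(edges, start_nodes):
    """Count rows for (d)-[*2]-(x), d in start_nodes.

    Direction is ignored and a relationship is not reused within one path.
    Assumes the edge list contains no self-loops.
    """
    incident = defaultdict(list)  # node -> [(edge_index, neighbour), ...]
    for index, (a, b) in enumerate(edges):
        incident[a].append((index, b))
        incident[b].append((index, a))

    rows = 0
    for d in start_nodes:
        for first_rel, middle in incident[d]:
            # Every other relationship on the middle node completes a path.
            rows += sum(1 for second_rel, _ in incident[middle]
                        if second_rel != first_rel)
    return rows


def even_spread_estimate(relationships, hubs):
    """Rows if relationships were spread evenly over hubs (an ASSUMPTION)."""
    per_hub = relationships // hubs
    return hubs * per_hub * (per_hub - 1)


# Toy graph (made up): three documents that all mention one country.
toy_edges = [("doc1", "country"), ("doc2", "country"), ("doc3", "country")]
all_docs_rows = count_two_hop_rows(toy_edges, ["doc1", "doc2", "doc3"])
one_doc_rows = count_two_hop_rows(toy_edges, ["doc1"])

# Node and relationship counts taken from the original post.
estimated_rows = even_spread_estimate(300_000, 300)
```

Under that even-spread assumption the unanchored pattern would produce on the order of 300 million rows, which could plausibly exhaust a 4 GB heap when they are all returned. Passing a single document as `start_nodes` counts only that document's paths, which is the situation the original WHERE clause on `source_id` describes.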

On Friday, November 14, 2014 4:39:26 PM UTC-5, Eric Gade wrote:
>
> Hey Michael,
>
> Because the query never actually finishes, I'm not sure I'm getting the 
> results you want.
>
> For vanilla Cypher:
>
> ==> GuardTimeoutException: timeout occurred (overtime=1)
>
> For the experimental profile:
>
> ==> GuardTimeoutException: timeout occurred (overtime=1075)
>
>
> BTW, if I remove the timeout limit, the query will not return...or at 
> least not in some reasonable amount of time that I've been able to measure. 
> Go ahead and let me know what you think. I'm going to connect to my remote 
> server and let this command run for a while and see what happens.
>
>
> On Thursday, November 13, 2014 11:25:30 AM UTC-5, Michael Hunger wrote:
>>
>> Would you be able to run neo4j-shell (or the old webui 
>> http://localhost:7474/webadmin -> console) and prefix your query with 
>> the profile keyword and send the output?
>>
>> Also prefix it with profile cypher 2.1.experimental
>> and do the same.
>>
>> Thanks a lot,
>>
>> Michael
>>
>> On Thu, Nov 13, 2014 at 3:18 PM, Eric Gade <[email protected]> wrote:
>>
>>> Hi Michael.
>>>
>>> Yes, I indexed the `source_id` properties for all nodes using the exact 
>>> syntax you described. I did it after the fact though, meaning after I had 
>>> migrated the data into the graph. I then went through and did MATCH(d) SET 
>>> d.source_id=d.source_id just to be safe.
>>>
>>> I'm not sure what the terminology is for relationships exactly, but 
>>> mine are definitely vectors in that :MENTIONS and :CONTAINS have arrows and 
>>> only go in one direction. For example, a document -[:MENTIONS]-> a country, 
>>> but not the other way around.
>>>
>>> On Wednesday, November 12, 2014 8:47:59 PM UTC-5, Michael Hunger wrote:
>>>>
>>>> Hi Eric,
>>>>
>>>> did you do:
>>>>
>>>> create index on :Document(source_id);
>>>>
>>>> Also, are your relationships bi-directional between the same two 
>>>> nodes?
>>>>
>>>> On Wed, Nov 12, 2014 at 11:06 PM, Eric Gade <[email protected]> wrote:
>>>>
>>>>> Hello. I have created what I believe is a not-terribly-complex Neo4j 
>>>>> database. If you want to cut to the chase, just scroll down to the 
>>>>> section called "*The Problem*".
>>>>>
>>>>> Here is the structure:
>>>>>
>>>>> *Nodes*
>>>>>
>>>>> (:Document) ~75k
>>>>> (:Country) ~300
>>>>> (:Person) ~8k
>>>>>
>>>>> *Relationships*
>>>>>
>>>>> -[:MENTIONS]-> ~300k
>>>>>
>>>>> *System Information*
>>>>>
>>>>> 16 Cores
>>>>> 480GB HD
>>>>> 48GB RAM
>>>>> Ubuntu Server 14.04 LTS
>>>>> Neo4j Version 2.1.5
>>>>>
>>>>> *Config*
>>>>>
>>>>> All I've adjusted in the config is the min and max heap size (disabled 
>>>>> by default):
>>>>> Min: 2048
>>>>> Max: 4096
>>>>>
>>>>> I set the max open files to 60000 from the default 1024 for my system 
>>>>> (Linux users know what I'm talking about)
>>>>>
>>>>> I set a max query time of two minutes via the 
>>>>> `org.neo4j.server.webserver.limit.executiontimeout` param, though I 
>>>>> only did this recently because many queries were taking longer than two 
>>>>> minutes. Prior to this, certain queries which I would guess should be 
>>>>> fast would never finish (see below).
>>>>>
>>>>> I have also indexed a property on all nodes called `source_id`, which 
>>>>> is the `id` value for these things in the database from which I imported 
>>>>> them.
>>>>>
>>>>>
>>>>> *Weird Observations*
>>>>>
>>>>> Before I altered the max and min heap sizes in the config file, `htop` 
>>>>> was showing me some (alarming??) stats -- VIRT was 17.5GB for the server 
>>>>> process.
>>>>> Now, with the new settings, it's at a much lower 10100M, but I still 
>>>>> don't understand why.
>>>>>
>>>>>
>>>>> *The Problem*
>>>>>
>>>>> Here's a simple query that never returns. I've waited as long as 5 
>>>>> minutes and still nothing:
>>>>> MATCH(d:Document)-[*2]-(something)
>>>>> WHERE d.source_id='SOMEIDHERE'
>>>>> RETURN d,something;
>>>>>
>>>>> Based on some of the queries I've seen other people talk about, with 
>>>>> variable relations in the dozens, and for datasets that have millions of 
>>>>> nodes using laptop hardware, something seems very wrong to me here.
>>>>>
>>>>> I've read all of the articles I could find on configurations and ways 
>>>>> to improve performance. Any ideas?
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>  -- 
>>>>> You received this message because you are subscribed to the Google 
>>>>> Groups "Neo4j" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send 
>>>>> an email to [email protected].
>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>
>>>>
>>>
>>
>>

