Re: [Neo4j] How does query profiler work?

Ryan John Velasco Tue, 24 Mar 2015 01:10:14 -0700

Hello,

I was able to speed up the queries by enabling relationship auto-indexing:
relationship_auto_indexing=true
relationship_keys_indexable=StartDate,EndDate


I also added USING INDEX hints to the queries that filter with IN or =.
The other thing I did was to rule out null values in those properties, which
also speeds up the queries.

Thanks for the help.
Ryan

----- Original Message -----
From: "Michael Hunger" <[email protected]>
To: "Ryan John Velasco" <[email protected]>
Sent: Thursday, March 19, 2015 7:18:01 PM
Subject: Re: [Neo4j] How does query profiler work?

Indexes for relationships are under discussion, but no decision has been made
yet.


Your point about adding a label for ":Current" or ":Active" nodes would make a
big difference, as checking a label is really fast compared to checking
properties.


for the comparison expression 
comparing something with null is always false and NOT() of that is always true 
that's why I thought it could help. 


Michael 





On 19.03.2015 at 11:25, Ryan John Velasco < [email protected] > wrote:


Please see my comments. 

----- Original Message -----

From: "Michael Hunger" < [email protected] > 
To: "Ryan John Velasco" < [email protected] > 
Sent: Thursday, March 19, 2015 5:44:30 PM 
Subject: Re: [Neo4j] How does query profiler work? 

I wonder why it doesn't use the index on :Company.ID -> it says it uses a 
Label-Scan instead. 

can you please run ":schema" in the browser 

can you expand all operations? 

and perhaps try this hint after the match 

USING INDEX T0:Company(ID) 

? 
> On 19.03.2015 at 10:30, Ryan John Velasco < [email protected] > wrote:
> 
> I put indexes on all nodes. StartDate and EndDate are on relationships, but
> sadly relationships don't have indexes. Also, your index is like a hash
> table, right?
the index is a b-tree 
Thanks for answering this. Is there a plan to also support range-query indexes
for relationships? All our relationships have a start and an end date.
> We decided not to build a time tree because our use case is range-based and
> it would complicate our queries. It is also difficult during exporting,
> because a node is valid not for a single date but for a range of dates.
ok 
> The solution I have in mind for now, to get a smaller result set, is to
> split each relationship type in two. For example, the Knows relationship
> would become KnowsPossiblyValidForTheMoment and KnowsEnded.
I don't understand? 

For example, Subscription will get SubscriptionPossiblyValidForTheMoment.
During export we will check whether the start and end dates are already past;
if not, it could still be valid now or in the future.
Most of our queries work with relationships that are valid at the moment, so
instead of queries like (subscription)-[]-() we can use
(SubscriptionPossiblyValidForTheMoment )-[]-() to get a smaller subset of nodes
by using a label or relationship type.
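
A minimal sketch of that export-time check, in Python rather than Cypher. The
function name and the suffix constants are hypothetical and only follow the
KnowsPossiblyValidForTheMoment / KnowsEnded naming above. Treating a missing
end date as open-ended is an assumption that mirrors the "EndDate IS NULL"
branch of the queries in this thread.

```python
# Hypothetical suffixes, following the naming proposed in this thread.
ENDED_SUFFIX = "Ended"
POSSIBLY_VALID_SUFFIX = "PossiblyValidForTheMoment"


def relationship_type(base, end_date, now):
    """Pick the relationship type to write during export.

    base     -- original type name, e.g. "Knows" or "Subscription"
    end_date -- numeric timestamp, or None when no end date is set (assumed open-ended)
    now      -- export time, in the same numeric unit as end_date
    """
    if end_date is not None and end_date < now:
        # The validity range is entirely in the past.
        return base + ENDED_SUFFIX
    # Valid now, or starting in the future: keep it in the "possibly valid" set.
    return base + POSSIBLY_VALID_SUFFIX
```

Queries that only care about currently valid data would then match on the
"PossiblyValidForTheMoment" type and still apply the date filter, since that
set also contains relationships that start in the future.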

btw you can probably simplify your query by using 

>> (T4.StartDate IS NULL or T4.StartDate <= 635623428460000000) 
-> 
>> NOT(T4.StartDate > 635623428460000000) 
as for NULL values the comparison evaluates to false and so the total 
expression will be true 

not sure if that helps 
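
One caveat worth checking on the version in use before relying on the rewrite:
as far as I know, the Cypher documentation describes three-valued logic, where
a comparison against null yields null (not false), NOT null is still null, and
WHERE keeps only rows whose predicate is true. Under those assumed rules the
two forms differ exactly when the property is null. The Python model below is
not Cypher; None stands in for null and all function names are made up for
illustration.

```python
def lte(a, b):
    """a <= b, unknown (None) if either side is null."""
    return None if a is None or b is None else a <= b


def gt(a, b):
    """a > b, unknown (None) if either side is null."""
    return None if a is None or b is None else a > b


def or3(x, y):
    """Three-valued OR: true wins, otherwise unknown wins over false."""
    if x is True or y is True:
        return True
    if x is None or y is None:
        return None
    return False


def not3(x):
    """Three-valued NOT: unknown stays unknown."""
    return None if x is None else not x


def where_keeps(x):
    """A WHERE clause keeps a row only when the predicate is true."""
    return x is True


def original_form(start, now):
    # (start IS NULL OR start <= now)
    return or3(start is None, lte(start, now))


def rewritten_form(start, now):
    # NOT(start > now)
    return not3(gt(start, now))


NOW = 635623428460000000
for start in (None, NOW - 1, NOW, NOW + 1):
    print(start,
          where_keeps(original_form(start, NOW)),
          where_keeps(rewritten_form(start, NOW)))
```

In this model the row with a null start date is kept by the original form and
dropped by the rewritten one; for non-null values both forms agree. If the
properties can never be null, the two forms are interchangeable.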

So if T4.StartDate is null it will be true, right? This will help. Will this
make the query faster? And will NOT(T4.EndDate < 635623428460000000) work the
same way?

> 
> Best Regards, 
> Ryan 
> 
> ----- Original Message -----

> From: "Michael Hunger" < [email protected] > 
> To: "Ryan John Velasco" < [email protected] > 
> Sent: Thursday, March 19, 2015 4:44:16 PM 
> Subject: Re: [Neo4j] How does query profiler work? 
> 
> What is the profile/explain output? 
> 
> Depending on your data this can result in many billions of paths to be 
> checked 
> 
> Add directions to your rels 
> 
> Is there an index/constraint on :Company.ID ? Add one if not 
> 
> In general all your _nodes_ can be attached to a time tree 
> 
> 
> Sent from my iPhone
> 
>> On 19.03.2015 at 08:21, Ryan John Velasco < [email protected] > wrote:
>> 
>> Hello, 
>> 
>> The actual query involving subscriptions for example. 
>> PROFILE MATCH 
>> (T0:Company)-[T2:Owner]-(T3:OwnedItems)-[T4:OwnedItem]-(T6:Fleet)-[T7:FleetGroup]-(T8:FleetGrouping)-[T9:GroupedSite]-(T11:Vessel)-[T12:At]-(T13:LocatedAt)-[T14:Located]-(T16:Invoiceable)-[T17:Goods]-(T19:SoldGoods)-[T20:Sold]-(T22:Subscription)-[T24:Subscription]-(T26:SalesSettings)-[T27:Contract]-(T28:Contract)-[T29:PriceAgreement]-(T31:SalesRelation)-[T32:Buyer]-(T33:Company)
>>  
>> WHERE T0.ID in [ 
>> 1, 
>> 2, 
>> 172076, 
>> 172079 
>> ] and (T4.StartDate IS NULL or T4.StartDate <= 635623428460000000) and 
>> (T4.EndDate IS NULL or T4.EndDate >= 635623428460000000) and (T9.StartDate 
>> IS NULL or T9.StartDate <= 635623428460000000) and (T9.EndDate IS NULL or 
>> T9.EndDate >= 635623428460000000) and (T14.StartDate IS NULL or 
>> T14.StartDate <= 635623428460000000) and (T14.EndDate IS NULL or T14.EndDate 
>> >= 635623428460000000) and (T17.StartDate IS NULL or T17.StartDate <= 
>> 635623428460000000) and (T17.EndDate IS NULL or T17.EndDate >= 
>> 635623428460000000) and (T20.StartDate IS NULL or T20.StartDate <= 
>> 635623428460000000) and (T20.EndDate IS NULL or T20.EndDate >= 
>> 635623428460000000) and (T22.SubscriptionStartDate IS NULL or 
>> T22.SubscriptionStartDate <= 635623428460000000) and 
>> (T22.SubscriptionEndDate IS NULL or T22.SubscriptionEndDate >= 
>> 635623428460000000) and (T24.StartDate IS NULL or T24.StartDate <= 
>> 635623428460000000) and (T24.EndDate IS NULL or T24.EndDate >= 
>> 635623428460000000) and (T29.StartDate IS NULL or T29.StartDate <= 
>> 635623428460000000) and (T29.EndDate IS NULL or T29.EndDate >= 
>> 635623428460000000) 
>> RETURN distinct T33.ID 
>> 
>> 
>> Would the time tree still be applicable?
>> 
>> Best Regards, 
>> Ryan 
>> 
>> 
>> ----- Original Message ----- 
>> From: "Michael Hunger" < [email protected] > 
>> To: "Ryan John Velasco" < [email protected] > 
>> Cc: [email protected] 
>> Sent: Wednesday, March 18, 2015 7:51:44 PM 
>> Subject: Re: [Neo4j] How does query profiler work? 
>> 
>> So whenever you don't need a property it's not loaded and the traversal is 
>> faster. 
>> 
>> Properties are stored in a linked list of property-records that can contain 
>> up to 4 properties each (depending on size). 
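
To illustrate why skipping property access keeps traversals cheap, here is a
toy Python model of the layout described above: a chain of records holding up
to 4 properties each. It is a sketch under that description only. The class
and function names are invented, the fixed capacity of 4 ignores the "depending
on size" part, and it says nothing about the real on-disk format.

```python
class PropertyRecord:
    """One record in the chain; holds at most CAPACITY key/value pairs."""
    CAPACITY = 4  # upper bound taken from the description above

    def __init__(self, slots):
        self.slots = dict(slots)
        self.next = None


def build_chain(props):
    """Split a property map into a linked list of records."""
    items = list(props.items())
    head = tail = None
    for i in range(0, len(items), PropertyRecord.CAPACITY):
        record = PropertyRecord(items[i:i + PropertyRecord.CAPACITY])
        if head is None:
            head = record
        else:
            tail.next = record
        tail = record
    return head


def lookup(head, key):
    """Walk the chain until the key is found; return (value, records_visited)."""
    visited = 0
    record = head
    while record is not None:
        visited += 1
        if key in record.slots:
            return record.slots[key], visited
        record = record.next
    return None, visited
```

In this model a lookup costs one step per record visited, and a missing
property forces a walk over the whole chain, which is why a filter that can be
answered without touching properties (a label or relationship type) is cheaper.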
>> 
>> Range indexes are planned, but I can't give a timeline. 
>> 
>> Michael 
>> 
>>> On 18.03.2015 at 10:35, Ryan John Velasco < [email protected] > wrote:
>>> 
>>> Is there a plan to support range queries someday? 
>>> How does Neo4j store the properties of a node? Is it also via a link to
>>> each property, or like a document (all properties in a single store)?
>>> 
>>> Best Regards, 
>>> Ryan 
>>> 
>>> ----- Original Message ----- 
>>> From: "Michael Hunger" < [email protected] > 
>>> To: [email protected] 
>>> Cc: "Ryan Velasco" < [email protected] > 
>>> Sent: Wednesday, March 18, 2015 5:30:20 PM 
>>> Subject: Re: [Neo4j] How does query profiler work? 
>>> 
>>> Right, you don't want to do that, there is currently no index support for 
>>> range queries. 
>>> Your query pulls all nodes and their properties into memory and does the 
>>> comparison there. 
>>> 
>>> Usually you have some other criteria to limit the search first. 
>>> For this concrete use-case it seems that you're looking at subscriptions 
>>> outside of a certain time range, you can also tag them with an additional 
>>> label 
>>> 
>>> I suggest that you either add something like a time-tree to your graph to 
>>> structure your subscriptions. 
>>> 
>>> see: http://neo4j.com/docs/stable/cypher-cookbook-path-tree.html 
>>> or http://graphaware.com/neo4j/2014/08/20/graphaware-neo4j-timetree.html 
>>> 
>>> Or you store a lower resolution date property e.g. down to the year, month 
>>> or day level, index it 
>>> 
>>> create index on :Subscription(start); 
>>> 
>>> and do a lookup via 
>>> (s.start IN range(2000,2012)) 
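
A hedged sketch of deriving such a lower-resolution value in Python. It assumes
the numeric timestamps in this thread (e.g. 635623428460000000) are .NET-style
ticks, i.e. 100-nanosecond intervals since 0001-01-01; that is an inference,
not something stated here. The function names and the yyyymmdd encoding are
illustrative choices.

```python
from datetime import datetime, timedelta

# Assumption: tick 0 corresponds to 0001-01-01T00:00:00.
TICK_EPOCH = datetime(1, 1, 1)


def ticks_to_datetime(ticks):
    """Convert 100 ns ticks to a naive datetime (sub-microsecond part dropped)."""
    return TICK_EPOCH + timedelta(microseconds=ticks // 10)


def day_bucket(ticks):
    """Day-resolution integer such as 20150319, suitable for an exact-match index."""
    moment = ticks_to_datetime(ticks)
    return moment.year * 10000 + moment.month * 100 + moment.day


def year_bucket(ticks):
    """Year-resolution integer such as 2015."""
    return ticks_to_datetime(ticks).year
```

Under that assumption, day_bucket(635623428460000000) gives 20150319. A value
like this can be stored in the indexed property and looked up with IN over a
list of candidate days or years.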
>>> 
>>>> On 18.03.2015 at 10:22, Ryan John Velasco < [email protected] > wrote:
>>>> 
>>>> Thanks, I tried PROFILE on some of our queries, and there are queries
>>>> that have 3M hits. A sample query is 
>>>> MATCH (n:Subscription) 
>>>> WHERE (n.SubscriptionStartDate IS NULL or n.SubscriptionStartDate <= 
>>>> 635622637370000000) and (n.SubscriptionEndDate IS NULL or 
>>>> n.SubscriptionEndDate >= 635622637370000000) 
>>>> RETURN count(n) 
>>>> 
>>>> Best Regards, 
>>>> Ryan John Velasco 
>>>> 
>>>> ----- Original Message ----- 
>>>> From: "Michael Hunger" < [email protected] > 
>>>> To: [email protected] 
>>>> Sent: Tuesday, March 17, 2015 9:20:12 PM 
>>>> Subject: Re: [Neo4j] How does query profiler work? 
>>>> 
>>>> There is some more logging of queries in more recent versions. 
>>>> 
>>>> 
>>>> in neo4j.properties 
>>>> 
>>>> 
>>>> 
>>>> dbms.querylog.enabled=true 
>>>> # in ms 
>>>> dbms.querylog.threshold=500 
>>>> dbms.querylog.path=data/log/queries.log 
>>>> 
>>>> 
>>>> In Neo4j 2.2 you can prepend EXPLAIN to a query, which will show you the
>>>> query plan visually (in the browser) or textually (in Neo4j shell). It
>>>> doesn't run the query. 
>>>> 
>>>> 
>>>> If you use PROFILE it will run the query and show also the db-hits. 
>>>> 
>>>> 
>>>> As you haven't shared your queries or data model, there is not much more
>>>> I can help you with. 
>>>> 
>>>> 
>>>> Cheers; Michael 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> On 17.03.2015 at 05:14, Ryan Velasco < [email protected] > wrote:
>>>> 
>>>> 
>>>> Thanks for the reply. 
>>>> How do you do query profiling? I have observed that when SQL Server is
>>>> eating the memory, the Neo4j queries get slow. Maybe in production it
>>>> will be different, because we plan to dedicate a machine with only Neo4j
>>>> installed on it. Can you recommend good specs for that machine? We plan
>>>> to use a Core i7 with 8GB of memory. Is there a feature that lets me view
>>>> the history of queries and the time each took to retrieve the data? 
>>>> 
>>>> 
>>>> Thanks, 
>>>> Ryan 
>>>> 
>>>> On Monday, March 16, 2015 at 7:33:08 PM UTC+8, Michael Hunger wrote: 
>>>> 
>>>> 
>>>> Please always include the information about the actual queries you run and 
>>>> the actual dataset information. 
>>>> Otherwise no one can help you 
>>>> 
>>>> 
>>>> and you should also include the profiling info of your two queries. 
>>>> 
>>>> 
>>>> Also try to measure query performance from the Neo4j-shell to see the 
>>>> least impact from drivers or additional requests in the neo4j-browser. 
>>>> You can also use: http://localhost:7474/webadmin/#/console/ for that query 
>>>> time testing. 
>>>> 
>>>> 
>>>> Michael 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> On 16.03.2015 at 10:43, Ryan Velasco < ry...@ limesource.se > wrote:
>>>> 
>>>> 
>>>> Hello, 
>>>> 
>>>> 
>>>> If I run a query a second time on its own it goes faster, but if I run it
>>>> together with other queries, the second run is still slow. 
>>>> 
>>>> 
>>>> Best Regards, 
>>>> Ryan <Run with many queries.png> <Run the queries many times alone.png> 
>>>> 
>>>> 
>>>> 
>>>> -- 
>>>> You received this message because you are subscribed to the Google Groups 
>>>> "Neo4j" group. 
>>>> To unsubscribe from this group and stop receiving emails from it, send an 
>>>> email to [email protected] . 
>>>> For more options, visit https://groups.google.com/d/optout . 
>>>> 
>>>> 
>>>> 
>>>> <plan.png> 
>> 
> <explain.png><plan.png> 
