Can you try this: profile MATCH (n:Topic), (m:Topic) where n.name = 'Topic1' and m.name = 'Topic2' MATCH p = (n)-[*0..2]-(m) return p, reduce(totProximity = 0, n IN relationships(p)| totProximity + n.proximity) AS pathProximity order by pathProximity DESC LIMIT 6
On Tue, Oct 14, 2014 at 9:06 AM, gg4u <[email protected]> wrote: > Hi Rodjer, > > thank you for your insights! > please see comments below: > > Il giorno lunedì 13 ottobre 2014 18:37:50 UTC+2, Rodger ha scritto: >> >> Hello, >> >> I've done a lot of RDBMS performance tuning. >> Just a few quick thoughts. >> >> >> Be sure to run the queries in the shell, if you are not already doing so. >> >> > Yes, they are run in the shell: > http://localhost:7474/webadmin/#/console/ > > >> How many rows are returned? Just sorting, then returning many rows, >> takes a long time to scroll them to output. >> >> >> > 9 rows > In the answer above, I wrote 9 paths > > > >> >> If you are getting duplicates, it may be the equivalent of a cartesian >> product, >> one of the worst things that can happen in RDBMS, and also one >> of the least known. See my presentation on them here: >> http://rodgersnotes.wordpress.com/2010/09/15/stamping-out- >> cartesian-products/ >> <http://www.google.com/url?q=http%3A%2F%2Frodgersnotes.wordpress.com%2F2010%2F09%2F15%2Fstamping-out-cartesian-products%2F&sa=D&sntz=1&usg=AFQjCNHJDOJ0IOsI6XRsg_9yuTscI4mqtQ> >> > > So I had a look at your pdf, > http://rodgersnotes.files.wordpress.com/2010/09/cartprodwordpress.pdf > page 11 > > and I think the idea you want to suggest, is to avoid duplicates (you > called them 'cartesian products') by enforcing conditions. > Though, since it is a graph db and not relational, not clear to me where > this applies because in the graph db I don't have 'jointed' queries between > tables, > so the conditions I have are, at least in my case, properties (index on > properties), and no-directional rels. > > >> >> >> Try: >> >> return p, count (*) >> order by count(*) >> > > I run: > > profile MATCH (n:Topic) , (m:Topic), p = (n)-[*0..2]-(m) where n.name = > 'Topic1' and m.name = 'Topic2' with p, n, m return p, count(*) order by > count(*); > > and I've got: (see there are also duplicates in paths: is it because I > have both (a)-[]->(b) and (a)<-[]-(b) ?) > > ==> > +--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ > ==> | p > > > | count(*) | > ==> > +--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ > ==> | > [Node[103105]{id:1092923,name:"Topic1"},:P_Topic_Link[71185298]{proximity:68},Node[1401899]{id:21375850,name:"Topic3"},:P_Topic_Link[71185313]{proximity:32},Node[1386672]{id:21245,name:"Topic2"}] > | 1 | > ==> | > [Node[103105]{id:1092923,name:"Topic1"},:P_Topic_Link[88675719]{proximity:28},Node[2594397]{id:31760062,name:"Topic4"},:P_Topic_Link[88675745]{proximity:23},Node[1386672]{id:21245,name:"Topic2"}] > | 1 | > ==> | > [Node[103105]{id:1092923,name:"Topic1"},:P_Topic_Link[30736000]{proximity:32},Node[2515502]{id: > 3106745,name:"Topic5"},:P_Topic_Link[30735974]{proximity:82},Node[1386672]{id:21245,name:"Topic2"}] > | 1 | > ==> | > [Node[103105]{id:1092923,name:"Topic1"},:P_Topic_Link[68206383]{proximity:72},Node[1202629]{id:19635605,name:"Topic6"},:P_Topic_Link[68206440]{proximity:32},Node[1386672]{id:21245,name:"Topic2"}] > | 1 | > ==> | > [Node[103105]{id:1092923,name:"Topic1"},:P_Topic_Link[98898173]{proximity:23},Node[3329750]{id:38567205,name:"Topic7"},:P_Topic_Link[98898126]{proximity:124},Node[1386672]{id:21245,name:"Topic2"}] > | 1 | > ==> | > [Node[103105]{id:1092923,name:"Topic1"},:P_Topic_Link[58107755]{proximity:55},Node[506613]{id:13841207,name:"Topic8"},:P_Topic_Link[58107766]{proximity:27},Node[1386672]{id:21245,name:"Topic2"}] > | 1 | > ==> | > [Node[103105]{id:1092923,name:"Topic1"},:P_Topic_Link[98898173]{proximity:23},Node[3329750]{id:38567205,name:"Topic7"},:P_Topic_Link[1025873]{proximity:124},Node[1386672]{id:21245,name:"Topic2"}] > | 1 | > ==> | > [Node[103105]{id:1092923,name:"Topic1"},:P_Topic_Link[5662626]{proximity:47},Node[736816]{id:157427,name:"Topic9"},:P_Topic_Link[5662565]{proximity:138},Node[1386672]{id:21245,name:"Topic2"}] > | 1 | > ==> | > [Node[103105]{id:1092923,name:"Topic1"},:P_Topic_Link[5662626]{proximity:47},Node[736816]{id:157427,name:"Topic9"},:P_Topic_Link[1025864]{proximity:138},Node[1386672]{id:21245,name:"Topic2"}] > | 1 | > ==> > +--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ > ==> 9 rows > ==> > ==> ColumnFilter(0) > ==> | > ==> +Sort > ==> | > ==> +EagerAggregation > ==> | > ==> +ColumnFilter(1) > ==> | > ==> +ExtractPath > ==> | > ==> +Filter > ==> | > ==> +TraversalMatcher > ==> > ==> > +------------------+---------+---------+-------------+----------------------------------------------------------------------------------+ > ==> | Operator | Rows | DbHits | Identifiers | > Other | > ==> > +------------------+---------+---------+-------------+----------------------------------------------------------------------------------+ > ==> | ColumnFilter(0) | 9 | 0 | | > keep columns p, count(*) | > ==> | Sort | 9 | 0 | | Cached( > INTERNAL_AGGREGATE931614f3-4def-4fc4-a80b-c6fca3839817 of type Integer) | > ==> | EagerAggregation | 9 | 0 | | > p | > ==> | ColumnFilter(1) | 9 | 0 | | > keep columns p, n, m | > ==> | ExtractPath | 9 | 0 | p | > | > ==> | Filter | 9 | 3032385 | | > (hasLabel(m:Topic(0)) AND Property(m,name(1)) == { AUTOSTRING1}) | > ==> | TraversalMatcher | 1010795 | 1024307 | | > m, UNNAMED36, m | > ==> > +------------------+---------+---------+-------------+----------------------------------------------------------------------------------+ > ==> > >> >> >> >> Without me looking at the raw data, and the query result, you >> seem to have many operations going on. So, you have a lot of rows in >> the profile output. >> > > Only 9 > > >> As a general rule, the more rows there are in the >> profile, the slower the response time is. >> ie. the more complex the query, the slower it is. >> >> >> If I were looking at this, I would try to isolate which part of >> the query is the slow part. The Return clause, or the Match clause? >> >> >> You've already tried the response times with the data. >> Try to simply: >> return count(*) . >> > > I run: > MATCH (n:Topic) , (m:Topic), p = (n)-[*0..2]-(m) where n.name = 'Topic1' > and m.name = 'Topic2' with p, n, m return p, count(*) order by count(*); > > and obtain 9 rows in 182799 ms > > I run: > MATCH (n:Topic), (m:Topic) where n.name = 'Topic1' and m.name = 'Topic2' > with n, m return count(*); > > and obtain 856ms > > > profile MATCH (n:Topic), (m:Topic) where n.name = 'Topic1' and m.name = > 'Topic2' with n, m return count(*); > > results in: > > > ==> ColumnFilter > ==> | > ==> +EagerAggregation > ==> | > ==> +SchemaIndex(0) > ==> | > ==> +SchemaIndex(1) > ==> > ==> > +------------------+------+--------+-------------+-------------------------------+ > ==> | Operator | Rows | DbHits | Identifiers | > Other | > ==> > +------------------+------+--------+-------------+-------------------------------+ > ==> | ColumnFilter | 1 | 0 | | keep > columns count(*) | > ==> | EagerAggregation | 1 | 0 | | > | > ==> | SchemaIndex(0) | 1 | 2 | m, m | { AUTOSTRING1}; > :Topic(name) | > ==> | SchemaIndex(1) | 1 | 2 | n, n | { AUTOSTRING0}; > :Topic(name) | > ==> > +------------------+------+--------+-------------+-------------------------------+ > > >> How many seconds response time is that, versus the original query? >> What is the resulting profile? >> >> >> > > So, it looks like it actually take huge time in traversing the graph, > while reasonable time '~900ms' to match a fullstring node. > > *Any idea for improving performance of traversal??* > > *It is a real problem, since also for getting results of first neighbors > of a node, I met the same problem which makes currently unfeasible for > production :* > *Anyone with real case of similar size graph and structure trying to > perform a similar query?* > > as example, this query to obtain first neighbors of node Topic44: > > MATCH (n:Topic) , (m), p = (n)-[*0..1]-(m) > where n.name = 'Topic44' > with p, n, m > return p, reduce(totProximity = 0, n IN relationships(p)| totProximity + > n.proximity) AS pathProximity order by pathProximity DESC LIMIT 6 > > returns > 6 rows in ~65000 ms VS 6 rows in less than a second with a NoSQL. > > Any idea? > > thank you guys for helping!! Hope to find a solution soon.. > > > > >> >> >> See also the tuning presentations I've done: >> http://rodgersnotes.wordpress.com/2010/09/14/oracle-performance-tuning/ >> <http://www.google.com/url?q=http%3A%2F%2Frodgersnotes.wordpress.com%2F2010%2F09%2F14%2Foracle-performance-tuning%2F&sa=D&sntz=1&usg=AFQjCNE0XK_XcNk5YBj806h6a1OJHr0glA> >> http://rodgersnotes.wordpress.com/2014/06/08/tuning-the- >> untunable-when-indexes-and-optimizer-dont-help-2/ >> <http://www.google.com/url?q=http%3A%2F%2Frodgersnotes.wordpress.com%2F2014%2F06%2F08%2Ftuning-the-untunable-when-indexes-and-optimizer-dont-help-2%2F&sa=D&sntz=1&usg=AFQjCNFgTfu5bnjPw6boHWttJpzQBtaNgw> >> They are quick reads. >> >> thank you, seen them, > they are about SQL tuning mostly: > I've just used neo4j strucutre to store a graph with same label on 4M > topics (I MUST keep it with one label), index on topic(name) property and > used cypher to query the db, > this is my data structure. > > I've put a number of principles and principles in there, that you might >> apply. >> ie. Could you create the NEO4J equivalent of a temp table? >> >> >> Hope this helps. >> >> >> On Thursday, October 9, 2014 2:41:47 AM UTC-5, gg4u wrote: >>> >>> Hi Micheal, thank you. >>> sure I post my profile result here below ! >>> >>> >>>> -- > You received this message because you are subscribed to the Google Groups > "Neo4j" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to [email protected]. > For more options, visit https://groups.google.com/d/optout. > -- You received this message because you are subscribed to the Google Groups "Neo4j" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. For more options, visit https://groups.google.com/d/optout.
