Profile for the last query:
profile MATCH p = (n:Topic)-[*..2]-(m:Topic) where n.name = 'Topic66' and
m.name = 'Topic111' with p, n, m return p, reduce(totProximity = 0, n IN
relationships(p)| totProximity + n.proximity) AS pathProximity order by
pathProximity;
==> 2411 rows
==>
==> ColumnFilter(0)
==> |
==> +Sort
==> |
==> +Extract
==> |
==> +ColumnFilter(1)
==> |
==> +ExtractPath
==> |
==> +Filter
==> |
==> +TraversalMatcher
==>
==>
==> +------------------+---------+---------+-------------+-------------------------------------------------------------------+
==> | Operator         | Rows    | DbHits  | Identifiers | Other                                                             |
==> +------------------+---------+---------+-------------+-------------------------------------------------------------------+
==> | ColumnFilter(0)  | 2411    | 0       |             | keep columns p, pathProximity                                     |
==> | Sort             | 2411    | 0       |             | Cached(pathProximity of type Any)                                 |
==> | Extract          | 2411    | 9640    |             | pathProximity                                                     |
==> | ColumnFilter(1)  | 2411    | 0       |             | keep columns p, n, m                                              |
==> | ExtractPath      | 2411    | 0       | p           |                                                                   |
==> | Filter           | 2411    | 4910094 |             | (hasLabel(m:Topic(0)) AND Property(m,name(1)) == { AUTOSTRING1})  |
==> | TraversalMatcher | 1636698 | 1681810 |             | m, UNNAMED19, m                                                   |
==> +------------------+---------+---------+-------------+-------------------------------------------------------------------+
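
A quick way to read the numbers in this profile, as a small Python sketch. The rows and db hits are copied from the table above; the variable names are only illustrative. It adds up the DbHits column and compares the rows produced by TraversalMatcher with the rows that survive Filter.

```python
# (operator, rows, db_hits) copied from the profile output above
profile = [
    ("ColumnFilter(0)", 2411, 0),
    ("Sort", 2411, 0),
    ("Extract", 2411, 9640),
    ("ColumnFilter(1)", 2411, 0),
    ("ExtractPath", 2411, 0),
    ("Filter", 2411, 4910094),
    ("TraversalMatcher", 1636698, 1681810),
]

rows = {name: r for name, r, _ in profile}
hits = {name: h for name, _, h in profile}

# Total store accesses for the whole query
total_db_hits = sum(hits.values())

# Fraction of candidate paths that the Filter step keeps
kept_fraction = rows["Filter"] / rows["TraversalMatcher"]

# Db hits spent by Filter per candidate path it receives
filter_hits_per_candidate = hits["Filter"] / rows["TraversalMatcher"]

print(total_db_hits)
print(f"{kept_fraction:.4%} of candidate paths kept")
print(filter_hits_per_candidate)
```

Nearly all of the work is in TraversalMatcher and Filter: about 1.6M candidate paths are built and checked, and only 2411 of them end at the wanted node.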
On Thursday, 16 October 2014 00:01:33 UTC+2, gg4u wrote:
>
> Sure, I tried three examples with (n), (n:Topic) and allShortestPaths(),
> and also profiled them:
>
> 1.
>
> MATCH p = (n:Topic)-[*0..2]-(m:Topic) where n.name =
> 'Topic1' and m.name = 'Topic2' return p,
> reduce(totProximity = 0, n IN relationships(p)| totProximity + n.proximity)
> AS pathProximity order by pathProximity DESC LIMIT 6;
>
> ==> | [Node[103105]{id:1092923,name:"Topic1"},:P_Topic_Link[5662626]{proximity:47},Node[736816]{id:157427,name:"Topic3"},:P_Topic_Link[5662565]{proximity:138},Node[1386672]{id:21245,name:"Topic2"}] | 185 |
> ==> | [Node[103105]{id:1092923,name:"Topic1"},:P_Topic_Link[5662626]{proximity:47},Node[736816]{id:157427,name:"Topic3"},:P_Topic_Link[1025864]{proximity:138},Node[1386672]{id:21245,name:"Topic2"}] | 185 |
>
> ...
>
>
> *==> 6 rows*
> *==> 162423 ms*
>
>
> *profile* MATCH p = (n:Topic)-[*0..2]-(m:Topic) where n.name =
> 'Topic1' and m.name = 'Topic2' return p, reduce(totProximity = 0, n IN
> relationships(p)| totProximity + n.proximity) AS pathProximity order by
> pathProximity DESC LIMIT 6;
>
> ==> 6 rows
> ==>
> ==> ColumnFilter
> ==> |
> ==> +Top
> ==> |
> ==> +Extract
> ==> |
> ==> +ExtractPath
> ==> |
> ==> +Filter
> ==> |
> ==> +TraversalMatcher
> ==>
> ==>
> ==> +------------------+---------+---------+-------------+-------------------------------------------------------------------+
> ==> | Operator         | Rows    | DbHits  | Identifiers | Other                                                             |
> ==> +------------------+---------+---------+-------------+-------------------------------------------------------------------+
> ==> | ColumnFilter     | 6       | 0       |             | keep columns p, pathProximity                                     |
> ==> | Top              | 6       | 0       |             | { AUTOINT3}; Cached(pathProximity of type Any)                    |
> ==> | Extract          | 9       | 36      |             | pathProximity                                                     |
> ==> | ExtractPath      | 9       | 0       | p           |                                                                   |
> ==> | Filter           | 9       | 3032385 |             | (hasLabel(m:Topic(0)) AND Property(m,name(1)) == { AUTOSTRING1})  |
> ==> | TraversalMatcher | 1010795 | 1024307 |             | m, UNNAMED20, m                                                   |
> ==> +------------------+---------+---------+-------------+-------------------------------------------------------------------+
> ==>
>
>
> MATCH p = *allShortestPaths*((n:Topic)-[*..2]-(m:Topic)) where n.name =
> 'Topic1' and m.name = 'Topic2' with p, n, m return p, reduce(totProximity
> = 0, n IN relationships(p)| totProximity + n.proximity) AS pathProximity
> order by pathProximity;
>
> ==> 9 rows
> *==> 10111 ms*
>
>
> ==> 9 rows
> ==>
> ==> ColumnFilter
> ==> |
> ==> +Sort
> ==> |
> ==> +Extract
> ==> |
> ==> +ShortestPath
> ==> |
> ==> +SchemaIndex(0)
> ==> |
> ==> +SchemaIndex(1)
> ==>
> ==>
> ==> +----------------+------+--------+-------------+-----------------------------------+
> ==> | Operator       | Rows | DbHits | Identifiers | Other                             |
> ==> +----------------+------+--------+-------------+-----------------------------------+
> ==> | ColumnFilter   | 9    | 0      |             | keep columns p, pathProximity     |
> ==> | Sort           | 9    | 0      |             | Cached(pathProximity of type Any) |
> ==> | Extract        | 9    | 36     |             | pathProximity                     |
> ==> | ShortestPath   | 9    | 0      | p           |                                   |
> ==> | SchemaIndex(0) | 1    | 2      | m, m        | { AUTOSTRING1}; :Topic(name)      |
> ==> | SchemaIndex(1) | 1    | 2      | n, n        | { AUTOSTRING0}; :Topic(name)      |
> ==> +----------------+------+--------+-------------+-----------------------------------+
>
>
> 2.
>
> MATCH p = (n:Topic)-[*0..2]-(m:Topic) where n.name = 'Topic44' and
> m.name = 'Topic2' return p, reduce(totProximity = 0, n IN
> relationships(p)| totProximity + n.proximity) AS pathProximity order by
> pathProximity DESC LIMIT 6;
>
> ==> 6 rows
> *==> 906108 ms*
>
>
>
> ==> 6 rows
> ==>
> ==> ColumnFilter
> ==> |
> ==> +Top
> ==> |
> ==> +Extract
> ==> |
> ==> +ExtractPath
> ==> |
> ==> +Filter
> ==> |
> ==> +TraversalMatcher
> ==>
> ==>
> ==> +------------------+---------+---------+-------------+-------------------------------------------------------------------+
> ==> | Operator         | Rows    | DbHits  | Identifiers | Other                                                             |
> ==> +------------------+---------+---------+-------------+-------------------------------------------------------------------+
> ==> | ColumnFilter     | 6       | 0       |             | keep columns p, pathProximity                                     |
> ==> | Top              | 6       | 0       |             | { AUTOINT3}; Cached(pathProximity of type Any)                    |
> ==> | Extract          | 67      | 268     |             | pathProximity                                                     |
> ==> | ExtractPath      | 67      | 0       | p           |                                                                   |
> ==> | Filter           | 67      | 3246003 |             | (hasLabel(m:Topic(0)) AND Property(m,name(1)) == { AUTOSTRING1})  |
> ==> | TraversalMatcher | 1082001 | 1097166 |             | m, UNNAMED20, m                                                   |
> ==> +------------------+---------+---------+-------------+-------------------------------------------------------------------+
>
>
>
> MATCH p = *allShortestPaths*((n:Topic)-[*..2]-(m:Topic)) where n.name =
> 'Topic44' and m.name = 'Topic2' with p, n, m return p,
> reduce(totProximity = 0, n IN relationships(p)| totProximity + n.proximity)
> AS pathProximity order by pathProximity;
>
>
> magically, and for the first time:
> *146ms*
>
>
> so:
>
> profile MATCH p = *allShortestPaths*((n:Topic)-[*..2]-(m:Topic)) where
> n.name = 'Topic44' and m.name = 'Topic2' with p, n, m return p,
> reduce(totProximity = 0, n IN relationships(p)| totProximity + n.proximity)
> AS pathProximity order by pathProximity;
>
>
> ==> 67 rows
> ==>
> ==> ColumnFilter
> ==> |
> ==> +Sort
> ==> |
> ==> +Extract
> ==> |
> ==> +ShortestPath
> ==> |
> ==> +SchemaIndex(0)
> ==> |
> ==> +SchemaIndex(1)
> ==>
> ==>
> ==> +----------------+------+--------+-------------+-----------------------------------+
> ==> | Operator       | Rows | DbHits | Identifiers | Other                             |
> ==> +----------------+------+--------+-------------+-----------------------------------+
> ==> | ColumnFilter   | 67   | 0      |             | keep columns p, pathProximity     |
> ==> | Sort           | 67   | 0      |             | Cached(pathProximity of type Any) |
> ==> | Extract        | 67   | 268    |             | pathProximity                     |
> ==> | ShortestPath   | 67   | 0      | p           |                                   |
> ==> | SchemaIndex(0) | 1    | 2      | m, m        | { AUTOSTRING1}; :Topic(name)      |
> ==> | SchemaIndex(1) | 1    | 2      | n, n        | { AUTOSTRING0}; :Topic(name)      |
> ==> +----------------+------+--------+-------------+-----------------------------------+
> ==>
>
>
>
>
> 3.
> So I tried:
>
> MATCH p = *allShortestPaths*((n:Topic)-[*..2]-(m:Topic)) where n.name =
> 'Topic66' and m.name = 'Topic111' with p, n, m return p,
> reduce(totProximity = 0, n IN relationships(p)| totProximity + n.proximity)
> AS pathProximity order by pathProximity;
>
> 2 rows
> 34337 ms
>
> and
>
> MATCH p = (n:Topic)-[*..2]-(m:Topic) where n.name = 'Topic66' and m.name
> = 'Topic111' with p, n, m return p, reduce(totProximity = 0, n IN
> relationships(p)| totProximity + n.proximity) AS pathProximity order by
> pathProximity;
>
> *2411 rows*
> *3228423 ms !!*
>
> Please also note that every row has a duplicate.
> (In my structure I do have both (a:Topic)-[]->(b:Topic) and
> (b:Topic)-[]->(a:Topic), but I thought that (a:Topic)-[]-(b:Topic) would
> give unique results, since the paths go through the same nodes. Or not?)
> ...
> ==> | [Node[1103460]{id:18831,name:"Topic66"},:P_Topic_Link[68136903]{proximity:189},Node[1198508]{id:19594028,name:"Topic113"},:P_Topic_Link[68136874]{proximity:368},Node[1603710]{id:22939,name:"Topic111"}] | 557 |
> ==> | [Node[1103460]{id:18831,name:"Topic66"},:P_Topic_Link[68136903]{proximity:189},Node[1198508]{id:19594028,name:"Topic113"},:P_Topic_Link[1113182]{proximity:368},Node[1603710]{id:22939,name:"Topic111"}] | 557 |
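
On the duplicates: the two rows above go through the same three nodes but differ in the id of the second relationship (68136874 vs 1113182). One possible explanation, assuming each pair of topics is linked by two separate relationships (one per direction), is that an undirected pattern matches each of those relationships on its own, so the two rows are two different paths. A toy Python sketch of that idea, with made-up relationship ids and a hypothetical helper name:

```python
from itertools import product

# Hypothetical toy data: (relationship id, start node, end node).
# Topic113 and Topic111 are linked once in each direction.
rels = [
    (1, "Topic66", "Topic113"),
    (2, "Topic113", "Topic111"),
    (3, "Topic111", "Topic113"),
]

def undirected_two_hop_paths(rels, source, target):
    """Enumerate 2-hop paths ignoring direction, never reusing a relationship."""
    paths = []
    for (id1, a1, b1), (id2, a2, b2) in product(rels, repeat=2):
        if id1 == id2:
            continue  # a relationship may appear only once in a path
        if source not in (a1, b1):
            continue  # the first hop must touch the source node
        mid = b1 if a1 == source else a1
        if {a2, b2} != {mid, target}:
            continue  # the second hop must join mid and target, either way round
        paths.append((source, id1, mid, id2, target))
    return paths

paths = undirected_two_hop_paths(rels, "Topic66", "Topic111")
node_sequences = {(s, m, t) for s, _, m, _, t in paths}

print(len(paths))           # distinct paths (relationship ids differ)
print(len(node_sequences))  # distinct node sequences
```

If only the node sequence matters, the rows would have to be grouped on the nodes of the path rather than on the path itself.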
>
>
>
>
> So allShortestPaths() is faster and returns **almost** the results I
> want, but **only** if the same searches were made before (cached). Could
> that be true?
> It would partly make sense: I expect graph algorithms to be faster than
> retrieving general paths, but retrieving 67 rows of general paths cannot
> be that slow... (in example 2, 906108 ms versus 146 ms for allShortestPaths()?)
>
> Would it make sense if I posted a Python script that generates a random
> structure similar to the one I have, posted again the configuration files
> used for my server and the batch importer, and posted the header I used for
> loading the CSV with the batch importer, so that you could tell me whether
> the response time can be under 1 s (production time)?
> Could you run the same tests and post the results and a step-by-step guide?
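
A minimal sketch of what such a generator could look like in Python. Everything here is an assumption for illustration: the column names are not the header the batch importer expects, the proximity range is arbitrary, and the sizes are scaled down while keeping the ratio of 25 relationships per node from the description of the dataset (4M nodes, 100M rels).

```python
import csv
import io
import random

def generate_graph(num_nodes, num_rels, seed=42):
    """Return (nodes, rels) for a random weighted graph of Topic nodes."""
    rng = random.Random(seed)  # fixed seed so the output is reproducible
    nodes = [(i, f"Topic{i}") for i in range(num_nodes)]
    rels = []
    for _ in range(num_rels):
        start, end = rng.sample(range(num_nodes), 2)  # two distinct nodes, no self-loops
        proximity = rng.randint(1, 500)               # arbitrary weight range
        rels.append((start, end, proximity))
    return nodes, rels

def to_tsv(header, rows):
    """Serialize rows as tab-separated text with a header line."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()

nodes, rels = generate_graph(num_nodes=1000, num_rels=25000)
nodes_tsv = to_tsv(["id", "name"], nodes)              # hypothetical header
rels_tsv = to_tsv(["start", "end", "proximity"], rels)  # hypothetical header

print(nodes_tsv.splitlines()[:3])
print(len(rels))
```

Uniform random endpoints will not reproduce the degree distribution of the real data, so timings on such a graph are only a rough indication.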
>
>
>
>
>
> On Wednesday, 15 October 2014 21:56:01 UTC+2, Michael Hunger wrote:
>
> Can you just try this please?
>
> MATCH p = (n:Topic)-[*0..2]-(m:Topic)
> where n.name = 'Topic1' and m.name = 'Topic2'
> return p, reduce(totProximity = 0, n IN relationships(p)| totProximity +
> n.proximity) AS pathProximity
> order by pathProximity DESC LIMIT 6;
>
>
>
> On Wed, Oct 15, 2014 at 2:52 PM, gg4u <[email protected]> wrote:
>
> Hi Michael,
>
> sorry, I don't understand what that means.
> Is there anything I can do to help you sort out the issue? :)
>
> What could I check or correct?
> What is a pattern matcher, and can you show me how you read the profile to
> reach your conclusion?
> What are the possible reasons for selecting the wrong pattern matcher, and
> how can I correct it?
>
> thank you
>
> On Wednesday, 15 October 2014 14:04:57 UTC+2, Michael Hunger wrote:
>
> Hi,
>
> from the profiling it seems that Cypher selects the wrong pattern matcher
> if we separate the node-lookup and path-match.
>
> profile
> MATCH p = (n:Topic)-[*0..2]-(m:Topic)
> where n.name = 'Topic1' and m.name = 'Topic2'
> return p, reduce(totProximity = 0, n IN relationships(p)| totProximity +
> n.proximity) AS pathProximity
> order by pathProximity DESC LIMIT 6;
>
>
> +------------------+------+--------+-------------+-------------------------------------------------------------------+
> | Operator         | Rows | DbHits | Identifiers | Other                                                             |
> +------------------+------+--------+-------------+-------------------------------------------------------------------+
> | ColumnFilter     | 0    | 0      |             | keep columns p, pathProximity                                     |
> | Top              | 0    | 0      |             | { AUTOINT3}; Cached(pathProximity of type Any)                    |
> | Extract          | 0    | 0      |             | pathProximity                                                     |
> | ExtractPath      | 0    | 0      | p           |                                                                   |
> | Filter           | 0    | 0      |             | (hasLabel(m:Topic(0)) AND Property(m,name(1)) == { AUTOSTRING1})  |
> | TraversalMatcher | 0    | 1      |             | m, UNNAMED20, m                                                   |
> +------------------+------+--------+-------------+-------------------------------------------------------------------+
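
A rough way to see why the choice of plan matters when both end nodes are known, sketched in Python. This only illustrates the general idea (expand from one end and filter afterwards, versus starting from both ends); it is not how Cypher implements either operator, and the graph and function names are made up.

```python
from collections import defaultdict

def build_adjacency(edges):
    """Undirected adjacency sets from (a, b) pairs."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj

def expand_then_filter(adj, source, target):
    """Walk two hops out of source, then keep only endpoints equal to target."""
    examined = 0
    mids = set()
    for mid in adj[source]:
        for end in adj[mid]:
            examined += 1          # every 2-hop endpoint is looked at
            if end == target:
                mids.add(mid)
    return mids, examined

def intersect_neighbours(adj, source, target):
    """Start from both ends and intersect their neighbour sets."""
    examined = len(adj[source]) + len(adj[target])
    return adj[source] & adj[target], examined

# Hypothetical toy graph: "n" touches 10 hubs, each hub has 100 leaves,
# and "m" touches only two of the hubs.
edges = [("n", f"h{i}") for i in range(10)]
edges += [(f"h{i}", f"leaf{i}_{j}") for i in range(10) for j in range(100)]
edges += [("m", "h0"), ("m", "h1")]
adj = build_adjacency(edges)

slow_mids, slow_work = expand_then_filter(adj, "n", "m")
fast_mids, fast_work = intersect_neighbours(adj, "n", "m")

print(sorted(slow_mids), slow_work)
print(sorted(fast_mids), fast_work)
```

Both functions find the same intermediate nodes, but the first one touches every 2-hop endpoint of the start node before discarding almost all of them.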
>
> On Wed, Oct 15, 2014 at 11:00 AM, gg4u <[email protected]> wrote:
>
> Hi Michael,
>
> your aggregation was only on the same paths, so you get 9 different paths
> but you didn't show the counts per path.
>
>
> That is not clear to me yet; I am going to post the results for each query
> you suggested.
>
> Rodger, to summarize this test:
> 4M nodes labeled 'Topic'
> 100M rels (weighted)
> Index on Topic(name), a string property on each node
> 'Topic' dominates the whole dataset, which will be a subgraph of a larger
> network (if I can get this to production response times, the next step will
> be a graph of 85M nodes, ~2B rels, with the same type of structure, keeping
> properties as node properties rather than decoupling them into other nodes).
> So this is a first real-case test to see whether it is feasible to use the
> Neo4j data structure vs NoSQL.
> And I'd love the answer to be yes :D
>
> Michael, here is another test with other topics (I think not cached):
>
> MATCH (n:Topic) , (m:Topic), p = (n)-[*0..2]-(m) where n.name =
> 'Topic100' and m.name = 'Topic2' with p, n, m return p, count(*) order
> by count(*);
>
> results:
> ==> +------------------------------------------------------------+
> ==> | p | count(*) |
> ==> +------------------------------------------------------------+
> ==> | [Node[4114904]{id:7955,name:"Topic100"},:P_Topic_Link[10618620]{proximity:90},Node[3528892]{id:411782,name:"Topic101"},:P_Topic_Link[1025954]{proximity:68},Node[1386672]{id:21245,name:"Topic2"}] | 1 |
> ==> | [Node[4114904]{id:7955,name:"Topic100"},:P_Topic_Link[2424845]{proximity:91},Node[3719110]{id:52502,name:"Topic102"},:P_Topic_Link[1025923]{proximity:85},Node[1386672]{id:21245,name:"Topic2"}] | 1 |
> ==> | [Node[4114904]{id:7955,name:"Topic100"},:P_Topic_Link[100682940]{proximity:19},Node[3461206]{id:39782569,name:"Topic103"},:P_Topic_Link[100682931]{proximity:107},Node[1386672]{id:21245,name:"Topic2"}] | 1 |
> ==> | [Node[4114904]{id:7955,name:"Topic100"},:P_Topic_Link[21653222]{proximity:82},Node[706102]{id:1551073,name:"Topic104"},:P_Topic_Link[21653218]{proximity:87},Node[1386672]{id:21245,name:"Topic2"}] | 1 |
>
> (.... results ...)
>
> ==> +------------------------------------------------------------+
> ==> *67 rows*
> ==>* 3900775 ms*
>
>
>
> On Tuesday, 14 October 2014 22:54:43 UTC+2, Michael Hunger wrote:
>
> How many rows does this return?
>
> MATCH (n:Topic) , (m:Topic), p = (n)-[*0..2]-(m) where n.name = 'Topic1'
> and m.name = 'Topic2' with p, n, m return p, count(*) order by count(*);
>
> your aggregation was only on the same paths, so you get 9 different paths
> but you didn't show the counts per path.
>
>
>
>
> and obtain 9 rows in 182799 ms
>
> On Tue, Oct 14, 2014 at 10:59 AM, gg4u <[email protected]> wrote:
>
> Yes:
>
> neo4j-sh (?)$ profile MATCH (n:Topic), (m:Topic) where n.name = 'Topic1'
> and m.name = 'Topic2' MATCH p = (n)-[*0..2]-(m) return p,
> reduce(totProximity = 0, n IN relationships(p)| totProximity + n.proximity)
> AS pathProximity order by pathProximity DESC LIMIT 6;
> ==>
> [...results...]
> ==> 6 rows
> ==>
> ==> ColumnFilter
> ==> |
> ==> +Top
> ==> |
> ==> +Extract
> ==> |
> ==> +ExtractPath
> ==> |
> ==> +PatternMatcher
> ==> |
> ==> +SchemaIndex(0)
> ==> |
> ==> +SchemaIndex(1)
> ==>
> ==> +----------------+------+--------+-------------------+-------------------------------------------------+
> ==> | Operator       | Rows | DbHits | Identifiers       | Other                                           |
> ==> +----------------+------+--------+-------------------+-------------------------------------------------+
> ==> | ColumnFilter   | 6    | 0      |                   | keep columns p, pathProximity                   |
> ==> | Top            | 6    | 0      |                   | { AUTOINT3}; Cached(pathProximity of type Any)  |
> ==> | Extract        | 9    | 36     |                   | pathProximity                                   |
> ==> | ExtractPath    | 9    | 0      | p                 |                                                 |
> ==> | PatternMatcher | 9    | 0      | n, m, UNNAMED94   |                                                 |
> ==> | SchemaIndex(0) | 1    | 2      | m, m              | { AUTOSTRING1}; :Topic(name)                    |
> ==> | SchemaIndex(1) | 1    | 2      | n, n              | { AUTOSTRING0}; :Topic(name)                    |
> ==> +----------------+------+--------+-------------------+-------------------------------------------------+
> ==>
> neo4j-sh (?)$
>
>
>
> On Tuesday, 14 October 2014 10:00:29 UTC+2, Michael Hunger wrote:
>
> Can you try this:
>
> profile
> MATCH (n:Topic), (m:Topic)
> where n.name = 'Topic1' and m.name = 'Topic2'
> MATCH p = (n)-[*0..2]-(m)
> return p, reduce(totProximity = 0, n IN relationships(p)| totProximity +
> n.proximity) AS pathProximity
> order by pathProximity DESC
> LIMIT 6
>
>
>
> ...
--
You received this message because you are subscribed to the Google Groups
"Neo4j" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
For more options, visit https://groups.google.com/d/optout.