Re: [Neo4j] Traversing Large (weighted) graphs: performance, data structure, indexes

gg4u Wed, 15 Oct 2014 02:18:11 -0700

Hi Rodger,

thank you for your blog post: it is useful for my next steps (I am also 
looking at extracting the paths and nodes belonging to the first, second 
and later generations)!


a quick note on what you suggested:
MATCH (*) Return count(*);
==> SyntaxException: Invalid input '*': expected whitespace, an identifier, 
node labels, a property map, ')' or 
org$neo4j$cypher$internal$compiler$v2_1$parser$Patterns$$PatternElement 
(line 1, column 8)

Hm, maybe it was accepted in an older version?
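
For what it's worth, a form that should get past the parser, assuming the 
intent was simply to count every node in the store, binds an identifier 
(n here, an arbitrary name) in place of '*':

```cypher
// Bind each node to an identifier instead of using '*' inside the pattern.
// 'n' is an arbitrary name; no label is given, so every node is matched.
MATCH (n)
RETURN count(n);
```

On a large store this still has to visit every node, so it may take a while.
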
I can tell you that, roughly, each node has up to ~100 relationships to 
first-generation nodes. 
In this data structure, a topic is unique (there is only one 'Topic1').

On Tuesday, 14 October 2014 17:28:57 UTC+2, Rodger wrote:
>
> Hello again,
>
> Actually when I mentioned the shell, I was referring to 
> the character shell. On Linux, run with 
> ./neo4j-shell
>
>
> Although, if you are only getting 9 rows back, this shouldn't make much 
> difference. 
>
>
> -----
>
> Looking at these two queries:
>
>
> MATCH (n:Topic) , (m:Topic), p = (n)-[*0..2]-(m) 
> where n.name = 'Topic1' and m.name = 'Topic2' 
> with p, n, m 
> return p, count(*) 
> order by count(*);
>
>
> 9 rows in 182799 ms
>
> (3 minutes)
>
>
> MATCH (n:Topic), (m:Topic) 
> where n.name = 'Topic1' and m.name = 'Topic2' 
> with n, m 
> return count(*);
>
> 856ms
>
> (less than a second) 
>
>
> The critical part that slows things down seems to be: 
> p = (n)-[*0..2]-(m) 
>
>
> BTW, how many rows are returned with this simpler, faster query? 
>
>
>
> -----
>
>
> Here is some more elementary analysis that I would typically do 
> if I didn't already know the answers. 
>
>
> How many nodes are in the whole dataset? 
>
>
> MATCH (*)  
> return count(*) 
>
>
>
> Do the Topics dominate the dataset?
> Or, are they just a small percent of the nodes?
>
>
> Find the distinct set of Labels, including Topic. 
>
> MATCH (x)
> RETURN labels(x), count(*)
> order by count(*)
>
>
>
> Find the distinct set of Topics:
>
> MATCH (n:Topic)  
> return n.name , count(*) 
> order by count(*)
>
>
> Do you get many nodes for Topic 1 and 2, but find only 9 paths between 
> them?
> (more work)
>
> Or, just a few nodes for topic 1 and 2? (above)
> (less work)
>
>
> -----------
>
>
> I'm also wondering if you are getting multiple paths between
> the same two nodes, thus the duplicates. 
>
> See a post I did last year on this subject:
>
> Counting Many Paths Between Nodes In NEO4J
>
> http://rodgersnotes.wordpress.com/2013/08/16/counting-many-paths-between-nodes-in-neo4j/
>
> See the part:
> List Every Distinct Path Between Two Specific Nodes:
>
>
> In which case, you might want to look at shortestPath() or 
> allShortestPaths(). 
>  
>
> -----
>
> Just some thoughts.  
>
> Hope it's useful and not too elementary. 
>
>
> On Tuesday, October 14, 2014 2:06:06 AM UTC-5, gg4u wrote:
>>
>> Hi Rodger,
>>
>> thank you for your insights!
>> please see comments below:
>>
