Re: [Neo4j] Optimizing ShortestPath query for a better performances

idor Sun, 26 Jun 2016 03:30:39 -0700

Hi Michael,
I experimented with your suggested query. With shortestPath I got far fewer 
DB hits.



My query (with shortestPath):

PROFILE MATCH p=shortestPath((user:User{userId:'userId123'})-[r*1..3]-(f:User))
        WHERE f <> user 
        RETURN *

result:

<https://lh3.googleusercontent.com/-ds4qwKmZYCI/V2-uIXvSMpI/AAAAAAAAOYE/g-mgIM3ZZ4MQuai64aMhIIP5g_ouxSgvQCLcB/s1600/Screen%2BShot%2B2016-06-26%2Bat%2B1.28.37%2BPM.png>

And using your query (it returns lots of paths ending at the same nodes, which 
is redundant):

PROFILE MATCH (user:User{userId:'userId123'})
 MATCH p=(user)-[r*1..3]-(friend:User)
        WHERE friend <> user 
        RETURN friend.userId as userId, reduce(base = '', rel in r | base + ' ' + rel.dist) as dist


result:

<https://lh3.googleusercontent.com/-GQY7WeqhW1s/V2-uo6YnDnI/AAAAAAAAOYM/KyaRtb4qAmY862yR-bDw-wWhytKWpPrcwCLcB/s1600/Screen%2BShot%2B2016-06-26%2Bat%2B1.29.38%2BPM.png>
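
One way to drop the redundant rows, sketched here and not yet profiled: group 
the paths by end node and keep only the shortest one per friend. This assumes 
the same :User label, userId value and dist relationship property as the 
queries above. The aliases shortest and dists are made up for illustration, 
and dist comes back as a list instead of a concatenated string.

```cypher
// Expand up to 3 hops, then keep one path per distinct end node.
MATCH (user:User {userId: 'userId123'})
MATCH p = (user)-[*1..3]-(friend:User)
WHERE friend <> user
// Sort so the shortest path to each friend is collected first
// (assumes collect() sees the rows in the sorted order).
WITH friend, p
ORDER BY length(p)
WITH friend, head(collect(p)) AS shortest
RETURN friend.userId AS userId,
       [rel IN relationships(shortest) | rel.dist] AS dists
```

This only shrinks the result set. The expansion still walks every path, so I 
would expect the DB hits to stay close to the second profile.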


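Another option is the APOC expandPath operation Michael mentions in the quoted 
reply below, with node-global uniqueness so that each node is reached at most 
once. A rough, untested sketch, assuming the APOC plugin is installed and that 
the procedure is exposed as apoc.path.expandConfig with minLevel, maxLevel and 
uniqueness config keys (please check this against the APOC version you run):

```cypher
MATCH (user:User {userId: 'userId123'})
// NODE_GLOBAL: every node is visited at most once during the expansion,
// so at most one path per end node is returned.
CALL apoc.path.expandConfig(user, {
  minLevel: 1,
  maxLevel: 3,
  uniqueness: 'NODE_GLOBAL'
}) YIELD path
WITH last(nodes(path)) AS friend, path
WHERE friend:User
RETURN friend.userId AS userId,
       [rel IN relationships(path) | rel.dist] AS dists
```
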
On Sunday, June 26, 2016 at 12:46:43 AM UTC+3, Michael Hunger wrote:
>
>
>
> On Sat, Jun 25, 2016 at 10:23 PM, idor <[email protected]> wrote:
>
>> Hi Michael,
>> Thanks for replying. 
>>
>> scale it out on a cluster.
>>
>>
>> It requires a Neo4j Enterprise license, doesn't it? 
>>
>
> It will require Neo4j Enterprise. Which license depends on your use case 
> and company.
>  
>
>>
>> Why do you string-concatenate the dist property?
>>>
>>
>> On each relationship we have a numeric property that we want to aggregate 
>> later on. If A is connected to D this way (A->B->C->D), we need the 
>> properties of each relationship between two nodes within 3 hops 
>> (A->B, B->C, C->D).
>>
>> The expected result would be:
>> B, C, D and the relationships between them. 
>>
>
> I don't understand what you're saying. 
>
>>
>>
>> But you also know that this can potentially return a lot of data? e.g. if 
>>> you have 100 friends on average this returns 100^3 aka 1M results !
>>
>>
>> The max number of "friends" per node will be ~10, not more. 
>>
>> for what you do shortest Path is not the right solution
>>
>>  
>> How will this query avoid circular iterations? Let's assume the following 
>> relationships: 
>>
>
> Each relationship will appear only once in a path. So no circles.
>
> If you mean that there are multiple paths between your start node and its 
> 3rd degree neighbors, sure that can happen.
>
> You could aggregate on the end node and select one of the paths.
>
> In APOC procedures there is an expandPath operation 
> <https://github.com/neo4j-contrib/neo4j-apoc-procedures/blob/master/src/main/java/apoc/path/PathExplorer.java#L41>
> which allows you to define the uniqueness of nodes and relationships. There you 
> could define node-global as uniqueness.
>
>  
>
>>
>> A->B
>> B->C->D 
>> B->F->D
>> B->H->D
>> ...
>> Now if I apply it to node A, the expected result is (B,C,D,F,H). But as you 
>> can see there are many *paths* leading to D. I might have lots of paths 
>> between one source and another (within 3 hops) until I have iterated over 
>> all relevant nodes. 
>> Wouldn't your query suffer performance-wise because of this?
>>
>> Thanks.
>>
>>
>>
>>
>>
>> On Saturday, June 25, 2016 at 10:46:37 PM UTC+3, Michael Hunger wrote:
>>>
>>> Hi,
>>>
>>> for what you do shortest Path is not the right solution, as you don't 
>>> want the path between two nodes but the neighborhood of one node.
>>>
>>> Also having that many connections when you don't have the CPUs to 
>>> process it doesn't make sense, scale it out on a cluster.
>>>
>>> Why do you string-concatenate the dist property?
>>>
>>> I presume you have an index / constraint on :User(userId) ?
>>>
>>> Try this instead:
>>>
>>>
>>> MATCH (user:User{userId:<someUser>})
>>>  MATCH p=(user)-[r:relation_type1|relation_type2|relation_3*1..3]-(friend:User)
>>>         WHERE friend <> user 
>>>         RETURN friend.userId as userId, reduce(base = '', rel in r | base + ' ' + rel.dist) as dist 
>>>
>>> But you also know that this can potentially return a lot of data? e.g. 
>>> if you have 100 friends on average this returns 100^3 aka 1M results !
>>>
>>> Michael
>>>
>>>
>>> On Sat, Jun 25, 2016 at 10:36 AM, idor <[email protected]> wrote:
>>>
>>>> I have ~1M nodes and ~1M relationships in my graph.
>>>>
>>>> The goal of my query is to retrieve all nodes related to a specific 
>>>> source node (within 3 hops) and aggregate properties in the response.
>>>>
>>>> My understanding is that shortestPath takes the source node and queries 
>>>> all other nodes (not necessarily related ones) in the graph to find the 
>>>> relevant results.
>>>> My latency is around 2-4 seconds under load (~4000 concurrent 
>>>> connections), which is too high. 
>>>>
>>>> Any idea how I could optimize my query for better performance?
>>>>
>>>>
>>>>  MATCH p=shortestPath((user:User{userId:<someUser>})-[r:relation_type1|relation_type2|relation_3*1..3]-(f:User))
>>>>         WHERE f <> user 
>>>>         RETURN (f.userId) as userId,
>>>>         reduce(base = '', rel in r | base + ' ' + rel.dist) as dist 
>>>>
>>>>
>>>> * notes: userId and relation_type1/2/3 are auto indexed
>>>>
>>>> * I am using the REST client and recently thought about upgrading to the 
>>>> new Bolt driver (not sure it will help)
>>>>
>>>> Thank you.
>>>>
>>>>
>>>> -- 
>>>> You received this message because you are subscribed to the Google 
>>>> Groups "Neo4j" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send 
>>>> an email to [email protected].
>>>> For more options, visit https://groups.google.com/d/optout.
>>>>
>>>
>>
>
>

