Re: [Neo4j] Running a match on multiple parameters

'Michael Hunger' via Neo4j Fri, 07 Apr 2017 11:16:20 -0700

I just wanted to see the PROFILE output.

I also wanted to hear from you which of the comparisons is the most selective, 
so that we could force the use of an index there.
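
As a rough illustration of what "most selective" means here, a minimal sketch in Python. It assumes the answer ranges suggested by the sample rows in the quoted message (quA to quF take 5 values, quG and quH take 3, quI and quJ take 2) and that every combination exists exactly once, which is consistent with the 562500 nodes mentioned. The names DISTINCT_VALUES, total_nodes and candidates_per_lookup are made up for this example.

```python
from math import prod

# Assumed number of distinct answers per question, inferred from the
# sample rows in the quoted message (not confirmed by the poster).
DISTINCT_VALUES = {
    "quA": 5, "quB": 5, "quC": 5, "quD": 5, "quE": 5, "quF": 5,
    "quG": 3, "quH": 3, "quI": 2, "quJ": 2,
}


def total_nodes(distinct):
    """Number of profiles if every answer combination exists exactly once."""
    return prod(distinct.values())


def candidates_per_lookup(distinct, indexed):
    """Nodes left after an equality lookup on the given properties,
    assuming the uniform distribution described above."""
    return total_nodes(distinct) // prod(distinct[p] for p in indexed)


if __name__ == "__main__":
    print("nodes:", total_nodes(DISTINCT_VALUES))
    for prop in ("quA", "quG"):
        print(prop, "alone leaves",
              candidates_per_lookup(DISTINCT_VALUES, [prop]), "candidates")
    seven = ["quA", "quB", "quC", "quD", "quE", "quF", "quG"]
    print("all seven leave",
          candidates_per_lookup(DISTINCT_VALUES, seven), "candidates")
```

Under these assumptions, a lookup on any one of quA to quF still leaves a fifth of all nodes as candidates for b1, and quG leaves a third, while all seven compared properties together narrow each lookup to a dozen nodes. The PROFILE output would show how close the real plan gets to that.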


Why the check on a smaller profile id?



Sent from my iPhone

> On 23.03.2017 at 22:30, Michael Hunger <[email protected]> wrote:
> 
> Hi Dave,
> 
> It would be good to look at a sample first of all.
> 
> You should create about 10k-100k relationships per transaction.
> 
> For "joining" nodes, which is not an optimized graph operation, you should 
> have at least the most selective properties indexed.
> 
> Before running the queries, I suggest using EXPLAIN / PROFILE,
> 
> e.g.
> 
> MATCH (a1:Profile), (b1:Profile)
> // property names are case-sensitive; profileID is assumed to be the stored name
> WHERE a1.profileID < b1.profileID
>   AND a1.quA = b1.quA AND a1.quB = b1.quB AND a1.quC = b1.quC
>   AND a1.quD = b1.quD AND a1.quE = b1.quE AND a1.quF = b1.quF
>   AND a1.quG = b1.quG
> CREATE UNIQUE (a1)-[:SIMILAR {strength: 7}]->(b1)
> 
> PROFILE // or EXPLAIN, to see the plan without running the query
> MATCH (a1:Profile)
> WITH a1 LIMIT 1000 // sample
> MATCH (b1:Profile)
> // property names are case-sensitive; profileID is assumed to be the stored name
> WHERE a1.profileID < b1.profileID
>   AND a1.quA = b1.quA AND a1.quB = b1.quB AND a1.quC = b1.quC
>   AND a1.quD = b1.quD AND a1.quE = b1.quE AND a1.quF = b1.quF
>   AND a1.quG = b1.quG
> MERGE (a1)-[rel:SIMILAR]-(b1) ON CREATE SET rel.strength = 7
> 
> You should see at least one index lookup for b1, ideally on the most 
> selective property.
> 
> Michael
> 
> 
>> On Thu, Mar 23, 2017 at 3:35 PM, Dave Clissold <[email protected]> 
>> wrote:
>> I am fairly new to programming, and this is my first time using graph 
>> databases, Cypher and Neo4j. I am learning as I go, testing whether each 
>> stage is a viable route to final development, and trying to gain enough of 
>> a basic understanding of each element the application needs so that I can 
>> hire and communicate with a full-time team and do the grunt work when 
>> needed, rather than be the entrepreneur who has no clue about what is 
>> happening and just expects things to happen. Any assistance would be 
>> greatly appreciated.
>> 
>> I am trying to create a database which will allow users with similar 
>> profiles to match. They have answered questions, and I have been able to 
>> create the nodes that represent each possible profile by assigning a 
>> numerical value to each answer, so I have:
>> 
>> :Profile
>> quA: 1, quB: 1,quC: 1, quD: 1, quE: 1, quF: 1, quG: 1, quH: 1, quI: 1, quJ: 1
>> ....
>> all the way to
>> ....
>> quA: 5, quB: 5,quC: 5, quD: 5, quE: 5, quF: 5, quG: 3, quH: 3, quI: 2, quJ: 2
>> 
>> where each numerical value is stored as an integer. This has resulted in 
>> 562500 nodes, imported from CSV, which created a 515 MB database. I have 
>> also concatenated the answers to create a unique ID for each node so that I 
>> can run the following query:
>> 
>> MATCH (a1:Profile), (b1:Profile)
>> // property names are case-sensitive; profileID is assumed to be the stored name
>> WHERE a1.profileID < b1.profileID
>>   AND a1.quA = b1.quA AND a1.quB = b1.quB AND a1.quC = b1.quC
>>   AND a1.quD = b1.quD AND a1.quE = b1.quE AND a1.quF = b1.quF
>>   AND a1.quG = b1.quG
>> CREATE UNIQUE (a1)-[:SIMILAR {strength: 7}]->(b1)
>> 
>> 
>> 
>> and so on, so that I cover every combination from 7 matching parameters up 
>> to 9 matching parameters. I know that this will eventually create 175 
>> relationships per node, so a massive total of 98,437,500 relationships.
>> 
>> 
>> 
>> I have set this up in a Docker container on a Google Compute instance with 
>> 8 cores and 52 GB of memory (the maximum on the free trial option), with a 
>> 65500 MB heap size (based on the calculator).
>> 
>> I am trying to find out if there is a more efficient way to create these 
>> relationships. On this setup I have tried running the first query above; it 
>> has so far taken over 5 hours and has not finished. Can anyone suggest a 
>> better query or workflow to create such a large number of relationships? 
>> The last thing I want to do is create the relationships individually and 
>> input them, unless someone can suggest a way of doing this via a script 
>> that sends the queries as JSON.
>> 
>> Regards
>> 
>> 
>> 
>> Dave
>> 
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "Neo4j" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to [email protected].
>> For more options, visit https://groups.google.com/d/optout.
> 

