I am fairly new to programming, and this is my first time using graph
databases, Cypher and Neo4j. I am learning as I go, testing whether each
stage is a viable route to final development, and trying to gain enough of
a basic understanding of each element of the application that I can hire
and communicate with a full-time team and do the grunt work when needed,
rather than be the entrepreneur who has no clue about what is happening and
just expects things to happen. Any assistance would be greatly appreciated.
I am trying to create a database that will allow users with similar
profiles to match. Users answer a set of questions, and I have been able to
create the nodes that represent every possible profile by assigning a
numerical value to each answer, so I have:
:Profile
quA: 1, quB: 1, quC: 1, quD: 1, quE: 1, quF: 1, quG: 1, quH: 1, quI: 1, quJ: 1
....
all the way to
....
quA: 5, quB: 5, quC: 5, quD: 5, quE: 5, quF: 5, quG: 3, quH: 3, quI: 2, quJ: 2
where each numerical value is stored as an integer. This has resulted in
562,500 nodes, imported by CSV, which created a 515 MB database. I have
also concatenated the answers to create a unique ID for each node so that I
can run the following query:
// Property names are case-sensitive, so profileID is spelled the same on both sides.
MATCH (a1:Profile), (b1:Profile)
WHERE a1.profileID < b1.profileID
  AND a1.quA = b1.quA AND a1.quB = b1.quB AND a1.quC = b1.quC
  AND a1.quD = b1.quD AND a1.quE = b1.quE AND a1.quF = b1.quF
  AND a1.quG = b1.quG
CREATE UNIQUE (a1)-[:SIMILAR {strength: 7}]->(b1)
and so on, so that I have a query for every combination from 7 matching
parameters up to 9 matching parameters. I know that this will eventually
create 175 relationships per node, so a massive total of 98,437,500
relationships.
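The figures above can be sanity-checked with a few lines of Python. This is
only a rough sketch under two assumptions: the number of possible answers
per question is read off the largest profile shown earlier, and 175 is taken
to be the number of ways of choosing which 7, 8 or 9 of the 10 questions
must match. It does not check how many profiles each individual query would
actually link.

```python
from itertools import product
from math import comb, prod

# Assumed number of possible answers per question, read off the largest
# profile shown above (quA..quF up to 5, quG/quH up to 3, quI/quJ up to 2).
OPTIONS = {
    "quA": 5, "quB": 5, "quC": 5, "quD": 5, "quE": 5,
    "quF": 5, "quG": 3, "quH": 3, "quI": 2, "quJ": 2,
}


def profile_ids():
    """Yield the concatenated ID of every possible answer combination."""
    ranges = [range(1, n + 1) for n in OPTIONS.values()]
    for answers in product(*ranges):
        # Each answer is a single digit, so concatenation stays unambiguous.
        yield "".join(str(a) for a in answers)


node_count = prod(OPTIONS.values())
ids = list(profile_ids())
assert len(ids) == len(set(ids)) == node_count  # IDs are unique

# Ways to choose which 7, 8 or 9 of the 10 questions must match.
query_count = sum(comb(len(OPTIONS), k) for k in (7, 8, 9))

print(node_count)                # 562500
print(ids[0], ids[-1])           # 1111111111 5555553322
print(query_count)               # 175
print(query_count * node_count)  # 98437500
```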
I have set this up in a Docker container on a Google Compute Engine
instance with 8 cores and 52 GB of memory (the max on the free trial
option), with a 65500 MB heap size (based on the calculator).
I am trying to find out whether there is a more efficient way to create
these relationships. On this setup I have tried running the first query
(above); it has so far taken over 5 hours and has not finished. Can anyone
suggest a better query or workflow to create such a large number of
relationships? The last thing I want to do is create the relationships
individually and input them, unless someone can suggest a way of doing this
via a script that sends the queries via JSON.
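To illustrate the kind of script I have in mind, here is a minimal Python
sketch, not a tested workflow. It groups profiles by the answers that must
match, so only profiles in the same group are paired, and writes the pairs
to CSV so they could be loaded in batches (for example with LOAD CSV). The
helper names, the CSV column names and the two-answer demo data are all
hypothetical.

```python
import csv
import io
from collections import defaultdict
from itertools import combinations, product

QUESTIONS = ["quA", "quB", "quC", "quD", "quE",
             "quF", "quG", "quH", "quI", "quJ"]


def similar_pairs(profiles, match_on):
    """Yield (id_a, id_b) with id_a < id_b for profiles that give the
    same answer to every question named in match_on."""
    idx = [QUESTIONS.index(q) for q in match_on]
    buckets = defaultdict(list)
    for answers in profiles:
        key = tuple(answers[i] for i in idx)
        # Same concatenated-ID scheme as described in the post.
        buckets[key].append("".join(map(str, answers)))
    for ids in buckets.values():
        # Only profiles that landed in the same bucket are compared.
        yield from combinations(sorted(ids), 2)


def write_relationships(profiles, match_on, out):
    """Write one CSV row per pair and return the number of rows written."""
    writer = csv.writer(out)
    writer.writerow(["from", "to", "strength"])  # hypothetical column names
    count = 0
    for a, b in similar_pairs(profiles, match_on):
        writer.writerow([a, b, len(match_on)])
        count += 1
    return count


# Tiny demo: two possible answers per question instead of the real ranges.
demo_profiles = list(product((1, 2), repeat=len(QUESTIONS)))
buffer = io.StringIO()
n = write_relationships(demo_profiles, QUESTIONS[:7], buffer)
print(n)  # 3584 (128 buckets of 8 profiles, 28 pairs per bucket)
```

As with the query above, a pair is written whenever the listed questions
match, whether or not the remaining answers also match.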
Regards
Dave