For my research I am comparing SQL queries with Cypher queries. I have a
relational database (with 50,000 relationships) that I converted to a Neo4j
database. I want to execute the same queries against the relational and the
graph database, and then improve execution time by replacing the SQL queries
with Cypher ones.

My database has entity classes, where each two entity classes are related by
a relationship class. I want to execute join queries (in the SQL case) and
Cypher queries with OPTIONAL MATCH clauses that return the same results. In a
relational database we can start selecting data from either a relation table
or a class table, and a relation table corresponds to a relationship in
Neo4j. That is why I have two types of queries, as follows:

    CALL apoc.index.relationships('relationshipclazz0','att0:*') YIELD rel AS R0
    OPTIONAL MATCH (N:entityclazz0)<-[R0]-(N0:entityclazz1)
    OPTIONAL MATCH (N0:entityclazz1)-[R1:relationshipclazz0]-()
    WITH DISTINCT R0, R0.att0 AS AR0att0, count(R1.att1) AS AR1att1,
         R1.att1 AS BR1att1
    ORDER BY AR1att1 DESC, BR1att1, ID(R0), AR0att0
    WITH ID(R0) AS i, R0.att0 AS O1, head(collect(BR1att1)) AS O2, R0
    RETURN O1, O2, count(i)
    ORDER BY O1, O2

In the first query I start from a relationship. The second query, in which I
start from an entityclazz node, is as follows:

    CALL apoc.index.nodes('entityclazz1','att0:*') YIELD node AS N0
    OPTIONAL MATCH (N0)-[R0:relationshipclazz0]-()
    OPTIONAL MATCH (N0)-[R1:relationshipclazz0]-()
    OPTIONAL MATCH ()-[R2:relationshipclazz0]-(N3:entityclazz1)
    WITH DISTINCT N0, N0.att0 AS AN0att0, count(R0.att3) AS AR0att3,
         R0.att3 AS BR0att3, count(N3.att1) AS AN3att1, N3.att1 AS BN3att1
    ORDER BY AN3att1 DESC, BN3att1, AR0att3 DESC, BR0att3, ID(N0), AN0att0
    WITH ID(N0) AS i, N0.att0 AS O1, head(collect(BR0att3)) AS O2,
         head(collect(BN3att1)) AS O3, N0
    RETURN O1, O2, O3, count(i)
    ORDER BY O1, O2, O3

Although I use node and relationship indexes through the APOC procedures,
these queries take a very long time to return a result: 2175417 ms (about 36
minutes) for the second query. As soon as I have more than two OPTIONAL MATCH
clauses in a query, it slows down and only gives a result after a long time.
Yet I am obliged to split my query into OPTIONAL MATCH clauses: if I write
just one path in a single MATCH, I only get the result of the last node or
relationship that I put in the path, not all the traversed nodes and
relationships. With OPTIONAL MATCH I can keep the result of the match
executed before plus the result of the optional match.
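
As an illustration only, the row-keeping behaviour described above can be mimicked in plain Python. This is a minimal sketch, assuming toy data (the node names "a" and "b", the relationship id "r1", and the helper functions `match` and `optional_match` are all hypothetical). It is not how Neo4j evaluates the query, it only shows which rows survive.

```python
# Toy data (hypothetical): two nodes, only node "a" has a relationship.
nodes = ["a", "b"]
rels = [("a", "r1")]  # (start node, relationship id)


def match(rows, rels):
    """Plain MATCH: rows without a matching relationship are dropped."""
    return [(n, r) for n in rows for (start, r) in rels if start == n]


def optional_match(rows, rels):
    """OPTIONAL MATCH: unmatched rows are kept and padded with None (null)."""
    out = []
    for n in rows:
        found = [(n, r) for (start, r) in rels if start == n]
        out.extend(found if found else [(n, None)])
    return out


inner_rows = match(nodes, rels)
outer_rows = optional_match(nodes, rels)
print(inner_rows)
print(outer_rows)
```

The second list still contains node "b" with a null relationship, which is the same effect as a LEFT OUTER JOIN on the SQL side.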
For example, if I execute this query:

    CALL apoc.index.nodes('entityclazz1','att0:*') YIELD node AS N0
    OPTIONAL MATCH (N0:entityclazz0)<-[R0:relationshipclazz0]-()
    WITH DISTINCT N0, N0.att0 AS AN0att0, count(R0.att1) AS AR0att1,
         R0.att1 AS BR0att1
    ORDER BY AR0att1 DESC, BR0att1, ID(N0), AN0att0
    WITH ID(N0) AS i, N0.att0 AS O1, head(collect(BR0att1)) AS O2, N0
    RETURN O1, O2, count(i)
    ORDER BY O1, O2

the result is:

    O1   O2     count(i)
    0    0      6
    0    1      2
    0    2      30
    0    3      22
    0    null   120
    1    0      2
    1    2      3
    1    3      3
    1    null   32

My problem is that I have to reduce the execution time of these queries. I
use Neo4j 3.1.0.
<https://lh3.googleusercontent.com/-nJJwcmWvlL0/WNgure4xN1I/AAAAAAAAAXM/m2h6TUI4xjsoLBd01KyNMIP3PLyt-gOhwCLcB/s1600/meta_graph.png>
This is the property graph "meta_graph" of my graph database, which has 5
node labels and 4 relationship types. Each node label corresponds to an
entity class table in the relational database, and each relationship type
corresponds to a relation ("join") table in the relational database.
<https://i.stack.imgur.com/302LU.png>
Now I want to find the links between each attribute X.A and its parents
X.K.B, where K is a path that relates the attribute X.A to its parent X.B.
I need this because I work in the domain of probabilistic graphical models.

To make this clearer, here is a real example, shown in the following
pictures. It has three node labels (professor, course and student) and two
relationship types (takes and teaches). Properties ("attributes") are drawn
in circles, and red arcs represent dependency relationships between
properties.
<https://i.stack.imgur.com/YDdmD.jpg>
<https://lh3.googleusercontent.com/-y9yayz4J7S8/WNgwWXBDRfI/AAAAAAAAAXU/FOi1JKDK3_cS8jUrLNOn_XfBCdjp_esYQCLcB/s1600/17474890_668100610060222_1222690006_n.jpg>
For example, here student.grade depends on student.intelligence and on
course.difficulty. To find the probability of this dependency I need to
perform counts over each attribute and its parents. In the same example, I
need to count
(take.grade=A,take.course.difficulty=low,take.student.intelligence=high).
For this instance my query returns how many times this combination
("observation") occurs in my database. In a relational database, we need to
join relation and class tables using foreign keys, so that we can navigate
between tables and reach each attribute in its class or relation table.

In my case I work with a generic form of database, in which entity classes
are related by relationship classes. Entity and relationship classes have
attributes att0, att1, att2, ... and each attribute ("property") has a
domain of values. For example, the domain of entityclazz0.att0 is [0,1,2]
and the domain of entityclazz1.att1 is [0,1], etc.
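
The kind of count described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rows in `observations` are made-up data standing in for the result of the joins (or of the MATCH / OPTIONAL MATCH clauses), and the final relative-frequency estimate is an assumption about how the counts are used, not something stated in the queries.

```python
from collections import Counter

# Hypothetical joined rows, one per "takes" relationship:
# (take.grade, take.course.difficulty, take.student.intelligence)
observations = [
    ("A", "low", "high"),
    ("A", "low", "high"),
    ("B", "low", "high"),
    ("A", "high", "low"),
]

# Count every (attribute, parents) combination.
joint_counts = Counter(observations)

# Count the parent configuration alone: (difficulty, intelligence).
parent_counts = Counter((diff, intel) for _, diff, intel in observations)

target = ("A", "low", "high")
n_joint = joint_counts[target]
n_parents = parent_counts[target[1:]]

# Assumed use of the counts: relative frequency of grade=A given its parents.
estimate = n_joint / n_parents
print(n_joint, n_parents, estimate)
```

Each key of `joint_counts` corresponds to one output row of a counting query (the attribute values plus the count), which is the shape of result the SQL and Cypher versions both have to produce.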
In SQL, making these counts corresponds to SELECT queries that can contain
multiple joins using foreign keys. My idea is to replace these joins with
Cypher queries in which I use MATCH and OPTIONAL MATCH clauses to count the
properties ("attributes"). Examples of such queries are given previously in
my first question and my comments.

In the relational case I can start selecting attributes from a relation
("join") table, or I can start selecting attributes from an entity class
table. So in my Cypher queries I sometimes have MATCH
()-[relationshipclazz]-() followed by a series of OPTIONAL MATCH clauses, and
sometimes MATCH (entityclazz) followed by a series of OPTIONAL MATCH clauses,
as mentioned in my first question. Because I use the APOC procedures, I just
replaced the first MATCH clause with, respectively,

    CALL apoc.index.relationships('relationshipclazz0','att0:*') YIELD rel as R0

and

    CALL apoc.index.nodes('entityclazz1','att0:*') YIELD node as N0

Please help me if you have any idea, because this is urgent. I want to
optimize my queries because, in a graph database with just 60,000
relationships and nodes, these types of queries take a lot of time,
especially when I have multiple OPTIONAL MATCH clauses (which correspond to
multiple joins in my SQL query). When I have many levels of joins, I
translate them into multiple OPTIONAL MATCH clauses, and this takes a lot of
time in a Cypher query.

I can also give you more details if you need them. Thanks in advance.