I am working with a directed acyclic graph (an ontology) that models
human health, and I need to identify certain diseases (example:
Pneumonia) that are infectious but NOT caused by certain bacteria
(Staphylococcus or Streptococcus). All concepts are nodes labeled
ObjectConcept. These nodes are connected by relationships such as [ISA],
[Pathological_process], [Causative_agent], etc.
The query requires:
a) Identification of all concepts subsumed by the concept Pneumonia as
follows:
MATCH p = (a:ObjectConcept{Pneumonia}) <-[:ISA*]- (b:ObjectConcept)
b) Identification of all concepts subsumed by Genus Staph and Genus Strep
(including the concept Genus Staph and Genus Strep) as follows. Note:
with b MATCH (b) q = (c:ObjectConcept{Strep})<-[:ISA*]-(d:ObjectConcept), h
= (e:ObjectConcept{Staph})<-[:ISA*]-(f:ObjectConcept)
c) Identify all nodes(p) that do not have a causative agent of Strep (i.e.,
nodes(q)) or Staph (nodes(h)) as follows:
with b,c,d,e,f MATCH (b),(c),(d),(e),(f) WHERE (b)--()-->(c) OR
(b)-->()-->(d) OR (b)-->()-->(e) OR (b)-->()-->(f) RETURN distinct b.Name;
The query returns the correct result but takes 20 minutes to run. Without
the Strep and Staph concepts, the same query returns a correct result in
< 0.5 seconds. XOR operators also work when and/or results are needed, but
speed is still an issue.
I am new to Cypher, but I am sure there is a more efficient query for this
problem. I am unsure whether, or how, collections and lists could be used
to improve run times. Suggestions?
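
A rough, untested sketch of how a collection might be used here: gather the
excluded organisms into one list first, then filter the Pneumonia subtree
against that list, so the ISA expansions are not matched side by side. It
assumes that concept names are stored in the Name property (as in b.Name
above), that 'Pneumonia', 'Strep' and 'Staph' are placeholders for the real
concept names, and that a causative agent sits two hops from the disease, as
in the (b)-->()-->(c) pattern above.

```cypher
// Assumptions: names live in the Name property; 'Strep', 'Staph' and
// 'Pneumonia' are placeholders for the real concept names.

// 1. Collect each genus plus everything below it
//    (ISA*0.. also matches the genus node itself).
MATCH (g:ObjectConcept)<-[:ISA*0..]-(org:ObjectConcept)
WHERE g.Name IN ['Strep', 'Staph']
WITH collect(DISTINCT org) AS excluded

// 2. Expand the Pneumonia subtree.
MATCH (:ObjectConcept {Name: 'Pneumonia'})<-[:ISA*]-(b:ObjectConcept)

// 3. Keep b only if no excluded organism is reachable in two hops,
//    mirroring the (b)-->()-->(c) pattern above.
WITH DISTINCT b, excluded
WHERE NONE(agent IN excluded WHERE (b)-->()-->(agent))
RETURN b.Name;
```

Whether this is actually faster would need to be checked by running it with
PROFILE against the real data.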