Re: [Neo4j] Re: Creating efficient multiple match queries?

Michael Hunger Sun, 22 Mar 2015 19:16:12 -0700

Hi Scott,

What Neo4j version are you running this on?
Can you share your database with me privately?



In your original query you had a positive "WHERE pattern" predicate, which can 
be transformed into a MATCH.

Here you have a WHERE NOT pattern, where that transformation is not so easy.
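
To illustrate the difference (a sketch only, reusing the sctid from the query 
in this thread and not run against the actual data): a positive pattern 
predicate keeps rows where the path exists, so it can be moved into the MATCH 
itself. The negated form has no path to match on, so it has to stay a filter.

```cypher
// Positive predicate: keep b only if a two-hop path to c exists
MATCH (c:ObjectConcept {sctid: 58800005}), (b:ObjectConcept)
WHERE (b)-->()-->(c)
RETURN DISTINCT b;

// Same result: the pattern becomes part of the MATCH
MATCH (b:ObjectConcept)-->()-->(c:ObjectConcept {sctid: 58800005})
RETURN DISTINCT b;

// Negated predicate: there is no path to match, so it stays in WHERE
MATCH (c:ObjectConcept {sctid: 58800005}), (b:ObjectConcept)
WHERE NOT (b)-->()-->(c)
RETURN DISTINCT b;
```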

The least I would change is to apply the check against "c" up front, before 
you expand the second path:

MATCH (c:ObjectConcept {sctid: 58800005}), (a:ObjectConcept {sctid: 233604007})
MATCH (a)<-[:ISA*]-(b:ObjectConcept)
WHERE NOT (b)-->()--(c)
WITH DISTINCT b, c
MATCH (c)<-[:ISA*]-(d:ObjectConcept)
WHERE NOT (b)-->()-->(d)
RETURN DISTINCT b
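
Because the ontology is a polyhierarchy, [:ISA*] can reach the same concept 
along many different paths, and every path produces its own row. One way to 
see how much DISTINCT saves is to compare the number of rows with the number 
of distinct concepts. This is a sketch that has not been run against the 
actual data, and "paths" and "concepts" are column names chosen only for 
illustration:

```cypher
// One row per ISA path from a descendant b up to a
MATCH (a:ObjectConcept {sctid: 233604007})<-[:ISA*]-(b:ObjectConcept)
RETURN count(*) AS paths, count(DISTINCT b) AS concepts
```

If paths is much larger than concepts, every later WHERE NOT check is being 
repeated many times for the same b unless the rows are reduced first.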

Optionally, try to reduce the cardinalities even more:

MATCH (c:ObjectConcept {sctid: 58800005}), (a:ObjectConcept {sctid: 233604007})
MATCH (a)<-[:ISA*]-(b:ObjectConcept)
// toggle this (try with and without the next line)
WITH DISTINCT b, c
WHERE NOT (b)-->()--(c)
WITH DISTINCT b, c
MATCH (c)<-[:ISA*]-(d:ObjectConcept)
// toggle this (try with and without the next line)
WITH DISTINCT b, d
WHERE NOT (b)-->()-->(d)
RETURN DISTINCT b
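
On the question about collections, another shape that may be worth trying is 
to collect the excluded concepts once and then test each b against that 
collection. This is an untested sketch, not a drop-in replacement: "excluded", 
"x" and "hits" are made-up names, and [:ISA*0..] is an assumption used so 
that c itself ends up in the collection along with its descendants.

```cypher
// Collect c and everything below it exactly once
MATCH (c:ObjectConcept {sctid: 58800005})<-[:ISA*0..]-(d:ObjectConcept)
WITH collect(DISTINCT d) AS excluded
// Expand the candidates and drop duplicate paths to the same b
MATCH (a:ObjectConcept {sctid: 233604007})<-[:ISA*]-(b:ObjectConcept)
WITH DISTINCT b, excluded
// Count two-hop paths from b that end in an excluded concept
OPTIONAL MATCH (b)-->()-->(x)
WHERE x IN excluded
WITH b, count(x) AS hits
WHERE hits = 0
RETURN b
```

Note that this keeps b only when none of the excluded concepts is reachable 
in two hops, whereas the queries above keep b as soon as one d row passes the 
WHERE NOT check, so the two can return different rows. Compare the output 
against a result that is known to be correct before relying on it.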

Michael

> On 23.03.2015 at 00:37, Scott Campbell <[email protected]> wrote:
> 
> Thanks, Michael.  
> 
> Yes, the ObjectConcept nodes are indexed.  Yes, there was a direction missing 
> in the original query, a typo on my part.  
> 
> All ObjectConcept nodes have an ISA relationship to their supertypes (a 
> polyhierarchy).  Also, ObjectConcepts have defining relationships with other 
> ObjectConcepts as necessary to disambiguate one ObjectConcept from another.
> 
> 
> I did try this query:
> 
> MATCH p = (a:ObjectConcept{sctid:233604007}) <-[:ISA*]- (b:ObjectConcept), 
> q=(c:ObjectConcept{sctid:58800005})<-[:ISA*]-(d:ObjectConcept) 
> WHERE NOT (b)-->()--(c) AND NOT (b)-->()-->(d) 
> RETURN distinct b 
> UNION  
> MATCH p = (a:ObjectConcept{sctid:233604007}) <-[:ISA*]- (b:ObjectConcept), t 
> = (e:ObjectConcept{sctid:65119002})<-[:ISA*]-(f:ObjectConcept) 
> WHERE NOT (b)-->()-->(e) AND NOT (b)-->()-->(f) 
> RETURN distinct b
> 
> The correct result returns in 20 seconds vs. 20 minutes...a huge 
> improvement, but I am sure that Neo can do better...with better query design. 
>  
> 
> The goal of the query is to identify all distinct nodes in paths p, q, and h. 
>  With those distinct nodes, the identification of a relationship between 
> nodes(p) and nodes(q) and/or nodes(h) is desired. 
> 
> Thanks
> 
> 
> 
> On Friday, March 20, 2015 at 3:27:23 PM UTC-5, Scott Campbell wrote:
> I am working with an acyclic, directed graph (an ontology) that models human 
> health and need to identify certain diseases (example: Pneumonia) that 
> are infectious but NOT caused by certain bacteria (staph or streptococcus).  
> All concepts are Nodes defined as ObjectConcepts.  ObjectConcepts are 
> connected by relationships such as [ISA], [Pathological_process], 
> [Causative_agent], etc. 
> 
> The query requires:
> 
>  a) Identification of all concepts subsumed by the concept Pneumonia as 
> follows:
> 
> MATCH p = (a:ObjectConcept{Pneumonia}) <-[:ISA*]- (b:ObjectConcept)
> 
> b) Identification of all concepts subsumed by Genus Staph and Genus Strep 
> (including the concept Genus Staph and Genus Strep) as follows.  Note: 
> 
> with b MATCH (b) q = (c:ObjectConcept{Strep})<-[:ISA*]-(d:ObjectConcept), h = 
> (e:ObjectConcept{Staph})<-[:ISA*]-(f:ObjectConcept) 
> 
> c) Identify all nodes(p) that do not have a causative agent of Strep (i.e., 
> nodes(q)) or Staph (nodes(h)) as follows:
> 
> with b,c,d,e,f MATCH (b),(c),(d),(e),(f) WHERE (b)--()-->(c) OR 
> (b)-->()-->(d) OR (b)-->()-->(e) OR (b)-->()-->(f) RETURN distinct b.Name;
> 
> The query returns the correct result, but runs for 20 min.  However, running 
> the same query without Strep or Staph concepts, the query returns a correct 
> result in < 0.5 seconds.  XOR operators also work when and/or results are 
> needed, but speed is still an issue.
> 
> I am new to Cypher, but I am sure that there is a more efficient query for 
> this problem.  I am unsure if/how collections and lists could be employed and 
> improve run times.  Suggestions?
> 
> -- 
> You received this message because you are subscribed to the Google Groups 
> "Neo4j" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to [email protected] 
> <mailto:[email protected]>.
> For more options, visit https://groups.google.com/d/optout 
> <https://groups.google.com/d/optout>.
