Cool! What's the execution time now? To speed up things create an index (hashindex) against A.aField and C.cField.
Why do you filter by class B (and C) ? Do you have other classes connected as incoming from A ? Lvc@ On 25 January 2014 00:15, Nick Bauer <[email protected]> wrote: > Thanks! > > Based on your starting point, this query seems to work: > > select from ( > select expand(in('D') ) from A where aField = 'a123' > ) where @class = 'B' and out('D') contains ( @class = 'C' and cField = > 'c456' ) > > > On Friday, January 24, 2014 3:47:25 PM UTC-7, Lvc@ wrote: > >> Hi, >> you could start with something like: >> >> select from ( >> select expand( in('D') ) from A where aField = 'a456' >> ) where in('D') contains ( cField = 'c123' ) >> >> The idea is: you start from A sub set and get all the incoming B vertices >> (connected as IN with edge 'D'). Then you filter only the instances that >> have incoming C vertices that are in the other subset. >> >> Let me know if works. >> Lvc@ >> >> >> >> On 24 January 2014 23:38, Nick Bauer <[email protected]> wrote: >> >>> The graph looks like this: >>> >>> A <--- B ---> C >>> >>> Basically, I want to start with a subset of A vertices and a subset of C >>> vertices, and I want to get back the subset of B vertices that is linked to >>> at least 1 vertex in each of the starting subsets. >>> >>> I actually just tried combining the two queries above, which seems to >>> work a lot better: >>> select * from ( >>> traverse out from ( >>> traverse in_D from ( >>> select * from A where aField = 'a456' >>> ) >>> ) >>> ) >>> where @class = 'B' and ( >>> out_D traverse(0,1) ( >>> @class = 'C' and cField = 'c123' >>> ) >>> ) >>> >>> Is there a better way to query for those B vertices? >>> >>> Thanks! >>> >>> >>> On Friday, January 24, 2014 3:03:25 PM UTC-7, Lvc@ wrote: >>> >>>> Hi Nick, >>>> can you explain your domain in terms of a graph? Like: >>>> >>>> A ---> B <---- C >>>> >>>> And where you start and what you want to get back? >>>> >>>> Lvc@ >>>> >>>> >>>> >>>> On 24 January 2014 22:00, Nick Bauer <[email protected]> wrote: >>>> >>>>> We are using OrientDB 1.6.4. >>>>> >>>>> We have a graph with 3 classes of vertices: A, B, and C >>>>> Class A has a medium multiplicity (tens of thousands of vertices) >>>>> Class B has a high multiplicity (hundreds of thousands of vertices) >>>>> Class C has a low multiplicity (a few vertices) >>>>> >>>>> We have also defined an edge class, D, which is used to link vertices >>>>> of all three classes. >>>>> >>>>> Every instance of class B is linked to at least one instance of A and >>>>> at least one instance of C. Also, the links from the B instances are >>>>> fairly evenly distributed among all of the A instances and among all of >>>>> the >>>>> C instances. >>>>> >>>>> Thus, there are a relatively small number of B instances linked to any >>>>> single A instance. There are a much higher number of B instances linked >>>>> to >>>>> any single C instance. >>>>> >>>>> We want to find all instances of class B that are linked to at least >>>>> one A instance that has a certain field value and that are also linked to >>>>> at least one instance of C that has a certain field value. >>>>> >>>>> >>>>> We can solve half of the problem with a query like this, which >>>>> traverses from the A instances that match the condition, and finds all of >>>>> the B instances linked to them: >>>>> select * from (traverse out from (traverse in_D from (select * from A >>>>> where aField = 'a456') )) where @class = 'B' >>>>> >>>>> This query quickly returns a relatively small number of B instances to >>>>> work with, so it seems beneficial to start there. However, >>>>> from there, we can continue to traverse out to the C instances linked >>>>> to those B instances, but it seems like if we do that, we lose the ability >>>>> to further filter out those B instances based on the C instances they link >>>>> to. >>>>> >>>>> Is there a way to filter the B instances based on the C instances they >>>>> link to, without traversing to the C instances themselves? >>>>> >>>>> We have also found that this query works, but given the high >>>>> multiplicity of class B, it seems to take a very long time (around 10 >>>>> minutes) to run: >>>>> select from B where ((out_D traverse(0,1) (@class = 'C' and cField = >>>>> 'c123')) and (out_D traverse(0,1) (@class = 'A' and aField = 'a456'))) >>>>> >>>>> Is there a better way to query for vertices based on their links to >>>>> multiple collections of vertices? >>>>> >>>>> -- >>>>> >>>>> --- >>>>> You received this message because you are subscribed to the Google >>>>> Groups "OrientDB" group. >>>>> To unsubscribe from this group and stop receiving emails from it, send >>>>> an email to [email protected]. >>>>> >>>>> For more options, visit https://groups.google.com/groups/opt_out. >>>>> >>>> >>>> -- >>> >>> --- >>> You received this message because you are subscribed to the Google >>> Groups "OrientDB" group. >>> To unsubscribe from this group and stop receiving emails from it, send >>> an email to [email protected]. >>> For more options, visit https://groups.google.com/groups/opt_out. >>> >> >> -- > > --- > You received this message because you are subscribed to the Google Groups > "OrientDB" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to [email protected]. > For more options, visit https://groups.google.com/groups/opt_out. > -- --- You received this message because you are subscribed to the Google Groups "OrientDB" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. For more options, visit https://groups.google.com/groups/opt_out.
