Re: [orientdb] Querying for vertices that are linked to two or more collections of vertices

Luca Garulli Fri, 24 Jan 2014 15:48:10 -0800

Cool!
What's the execution time now?

To speed up things create an index (hashindex) against A.aField and
C.cField.


Why do you filter by class B (and C) ? Do you have other classes connected
as incoming from A ?

Lvc@



On 25 January 2014 00:15, Nick Bauer <[email protected]> wrote:

> Thanks!
>
> Based on your starting point, this query seems to work:
>
>  select from (
>   select expand(in('D') ) from A where aField = 'a123'
> ) where @class = 'B' and out('D') contains ( @class = 'C' and cField =
> 'c456' )
>
>
> On Friday, January 24, 2014 3:47:25 PM UTC-7, Lvc@ wrote:
>
>> Hi,
>> you could start with something like:
>>
>> select from (
>>   select expand( in('D') ) from A where aField = 'a456'
>> ) where in('D') contains ( cField = 'c123' )
>>
>> The idea is: you start from A sub set and get all the incoming B vertices
>> (connected as IN with edge 'D'). Then you filter only the instances that
>> have incoming C vertices that are in the other subset.
>>
>> Let me know if works.
>> Lvc@
>>
>>
>>
>> On 24 January 2014 23:38, Nick Bauer <[email protected]> wrote:
>>
>>> The graph looks like this:
>>>
>>> A <--- B ---> C
>>>
>>> Basically, I want to start with a subset of A vertices and a subset of C
>>> vertices, and I want to get back the subset of B vertices that is linked to
>>> at least 1 vertex in each of the starting subsets.
>>>
>>> I actually just tried combining the two queries above, which seems to
>>> work a lot better:
>>> select * from (
>>>     traverse out from (
>>>         traverse in_D from (
>>>             select * from A where aField = 'a456'
>>>         )
>>>     )
>>> )
>>> where @class = 'B' and (
>>>     out_D traverse(0,1) (
>>>         @class = 'C' and cField = 'c123'
>>>     )
>>> )
>>>
>>> Is there a better way to query for those B vertices?
>>>
>>> Thanks!
>>>
>>>
>>> On Friday, January 24, 2014 3:03:25 PM UTC-7, Lvc@ wrote:
>>>
>>>> Hi Nick,
>>>> can you explain your domain in terms of a graph? Like:
>>>>
>>>> A ---> B <---- C
>>>>
>>>> And where you start and what you want to get back?
>>>>
>>>> Lvc@
>>>>
>>>>
>>>>
>>>> On 24 January 2014 22:00, Nick Bauer <[email protected]> wrote:
>>>>
>>>>> We are using OrientDB 1.6.4.
>>>>>
>>>>> We have a graph with 3 classes of vertices: A, B, and C
>>>>> Class A has a medium multiplicity (tens of thousands of vertices)
>>>>> Class B has a high multiplicity (hundreds of thousands of vertices)
>>>>> Class C has a low multiplicity (a few vertices)
>>>>>
>>>>> We have also defined an edge class, D, which is used to link vertices
>>>>> of all three classes.
>>>>>
>>>>> Every instance of class B is linked to at least one instance of A and
>>>>> at least one instance of C.  Also, the links from the B instances are
>>>>> fairly evenly distributed among all of the A instances and among all of 
>>>>> the
>>>>> C instances.
>>>>>
>>>>> Thus, there are a relatively small number of B instances linked to any
>>>>> single A instance.  There are a much higher number of B instances linked 
>>>>> to
>>>>> any single C instance.
>>>>>
>>>>> We want to find all instances of class B that are linked to at least
>>>>> one A instance that has a certain field value and that are also linked to
>>>>> at least one instance of C that has a certain field value.
>>>>>
>>>>>
>>>>> We can solve half of the problem with a query like this, which
>>>>> traverses from the A instances that match the condition, and finds all of
>>>>> the B instances linked to them:
>>>>> select * from (traverse out from (traverse in_D from (select * from A
>>>>> where aField = 'a456') )) where @class = 'B'
>>>>>
>>>>> This query quickly returns a relatively small number of B instances to
>>>>> work with, so it seems beneficial to start there.  However,
>>>>> from there, we can continue to traverse out to the C instances linked
>>>>> to those B instances, but it seems like if we do that, we lose the ability
>>>>> to further filter out those B instances based on the C instances they link
>>>>> to.
>>>>>
>>>>> Is there a way to filter the B instances based on the C instances they
>>>>> link to, without traversing to the C instances themselves?
>>>>>
>>>>> We have also found that this query works, but given the high
>>>>> multiplicity of class B, it seems to take a very long time (around 10
>>>>> minutes) to run:
>>>>> select from B where ((out_D traverse(0,1) (@class = 'C' and cField =
>>>>> 'c123')) and (out_D traverse(0,1) (@class = 'A' and aField = 'a456')))
>>>>>
>>>>> Is there a better way to query for vertices based on their links to
>>>>> multiple collections of vertices?
>>>>>
>>>>> --
>>>>>
>>>>> ---
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups "OrientDB" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>>> an email to [email protected].
>>>>>
>>>>> For more options, visit https://groups.google.com/groups/opt_out.
>>>>>
>>>>
>>>>  --
>>>
>>> ---
>>> You received this message because you are subscribed to the Google
>>> Groups "OrientDB" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> For more options, visit https://groups.google.com/groups/opt_out.
>>>
>>
>>  --
>
> ---
> You received this message because you are subscribed to the Google Groups
> "OrientDB" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/groups/opt_out.
>

-- 

--- 
You received this message because you are subscribed to the Google Groups 
"OrientDB" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
For more options, visit https://groups.google.com/groups/opt_out.

Re: [orientdb] Querying for vertices that are linked to two or more collections of vertices

Reply via email to