Re: [orientdb] No index use when using sub query

Jamie Blair Fri, 06 Mar 2015 09:06:07 -0800

Luca,

So I'll try and give you a full test case early next week. But basically
what I'm having trouble with is that its fine to put a index on the sub
query and that will make that part faster. But if the sub query were to
return 3000 rows the outer part of the query (where foo = "bar") will never
use the index on 'foo', and becomes very slow +1second.


I'll give you a full test case next week, however given an index on 'foo'
and 'status' should that query be fast? Or am I misunderstanding something?

Thanks,
Jamie
On 6 Mar 2015 16:38, "Luca Garulli" <[email protected]> wrote:

> Hi Jamie,
> To speed up the query like:
>
>    select from (select expand(in()) from User where status = 'active')
> where foo = "bar" LIMIT 10
> You should have an index on User.status to avoid scanning all User. Can
> you elaborate more the query that is slow?
>
> Lvc@
>
>
> On 5 March 2015 at 22:28, Jamie Blair <[email protected]> wrote:
>
>> Luca,
>>
>> So I had problems with this approach if I needed to select on a larger
>> collection rather that a single rid. So something along the lines of
>>
>>    select from (select expand(in()) from User where status = 'active')
>> where foo = "bar" LIMIT 10
>>
>> I ran into problems I'm assuming I'm not quite think in the correct way.
>> Also I know I could put a limit on in inner query also but that would mean
>> that I could potentially end up with less than the required number of
>> results.
>>
>> Hope that makes sense
>>
>> Thanks for the help,
>> Jamie
>> On 5 Mar 2015 19:48, "Luca Garulli" <[email protected]> wrote:
>>
>>> Hi Jamie,
>>> First, OrientDB supports direct loading by RID at O(1) cost. So this:
>>>
>>> SELECT FROM SomeVertex WHERE @rid in [#23:83354, #23:366, #23:99933,
>>> #23:80708, #23:70291]
>>>
>>> Can be translated to:
>>>
>>>
>>> *SELECT FROM [#23:83354, #23:366, #23:99933, #23:80708, #23:70291]*
>>> That is MUCH faster. Don't put indexes on RIDs: RIDs are physical
>>> positions and are the reason why OrientDB is so fast on traversing and
>>> direct loading by RID.
>>>
>>> Then, in this query you're thinking relational:
>>>
>>> SELECT FROM SomeVertex WHERE @rid in (SELECT in FROM SomeEdge WHERE out
>>> IN #27:819)
>>>
>>> Try this (it should take few ms):
>>>
>>> *SELECT expand(in()) FROM #27:819*
>>>
>>> If you want to filter by vertex's properties you can do:
>>>
>>> *SELECT FROM (*
>>>
>>> *  SELECT expand(in()) FROM #27:819) WHERE age >= 18*
>>>
>>> Lvc@
>>>
>>>
>>> On 3 March 2015 at 11:42, Jamie Blair <[email protected]> wrote:
>>>
>>>> Luigi,
>>>>
>>>> Thanks, but there is an extra part to our problem as soon as we want to
>>>> put conditions on the Vertex we can't. So for our use case we have a lucene
>>>> index on the Vertex:name, I've found the expand to not be very useful
>>>> because it kind of backs me into a corner with adding conditions to my
>>>> query.
>>>>
>>>> Is there a list of the current issues with the query optimizer
>>>> anywhere? Also if anybody could give me an example of how I would also add
>>>> a lucene condition to the above query that'd be greatly appreciated
>>>>
>>>> Thanks,
>>>> Jamie
>>>>
>>>>
>>>> On Tuesday, March 3, 2015 at 7:21:34 AM UTC, Luigi Dell'Aquila wrote:
>>>>>
>>>>> Hi Jamie,
>>>>>
>>>>> yes, the right thing to make it faster is this:
>>>>>
>>>>> select expand(inV()) FROM SomeEdge WHERE out IN #27:819
>>>>>
>>>>> Currently there are no work arounds for that at parser level.
>>>>> The problem is not tracked as a single issue because it's a wide
>>>>> topic, and it's being addressed as a full development process, but if you
>>>>> want you can create a specific issue for this
>>>>>
>>>>> Regards
>>>>>
>>>>> Luigi
>>>>>
>>>>>
>>>>> 2015-03-02 18:11 GMT+01:00 Jamie Blair <[email protected]>:
>>>>>
>>>>>> Luigi,
>>>>>>
>>>>>> Have you any idea how I would make this fast, or is there a work
>>>>>> around for the present query optimizer? Also is there a ticket I can 
>>>>>> track
>>>>>> in github for this?
>>>>>>
>>>>>> Thanks,
>>>>>> Jamie
>>>>>>
>>>>>> On Monday, March 2, 2015 at 4:55:51 PM UTC, Luigi Dell'Aquila wrote:
>>>>>>>
>>>>>>> Hi Jamie,
>>>>>>>
>>>>>>> this is a known issue, we are working hard on the new query parser
>>>>>>> and executor and one of the main goals of all this is query 
>>>>>>> optimization.
>>>>>>>
>>>>>>> Thanks
>>>>>>>
>>>>>>> Luigi
>>>>>>>
>>>>>>>
>>>>>>> 2015-03-02 17:11 GMT+01:00 Jamie Blair <[email protected]>:
>>>>>>>
>>>>>>>> The following query returns a set of `@rid` and completes in about
>>>>>>>> `0.012sec`
>>>>>>>>
>>>>>>>> SELECT in FROM SomeEdge WHERE out IN #27:819
>>>>>>>>
>>>>>>>> Now if I were to select from another Vertex using those `@rid`s in
>>>>>>>> there literal form, this would take a very long time and I get a 
>>>>>>>> timeout
>>>>>>>> (above 4seconds). I'm presuming its scanning over all the entries
>>>>>>>>
>>>>>>>> SELECT FROM SomeVertex WHERE @rid in [#23:83354, #23:366,
>>>>>>>> #23:99933, #23:80708, #23:70291]
>>>>>>>>
>>>>>>>> Interestingly enough if I were to use a single `@rid` rather than
>>>>>>>> a set it would be fast. So I'm assuming the query optimizer has the 
>>>>>>>> scope
>>>>>>>> to go a little further and also optimize multiple results
>>>>>>>>
>>>>>>>> SELECT FROM SomeVertex WHERE @rid in [#23:83354]
>>>>>>>>
>>>>>>>> But not to worry because I can make this faster by adding an index
>>>>>>>> to `SomeVertex.@rid` so now this is fast again
>>>>>>>>
>>>>>>>> CREATE INDEX foo on SomeVertex (@rid) unique
>>>>>>>> SELECT FROM SomeVertex WHERE @rid in [#23:83354,#23:366,#23:99933,#
>>>>>>>> 23:80708,#23:70291]
>>>>>>>>
>>>>>>>> But if I compose the 2 queries, I'd expect this to be fast, but
>>>>>>>> it's still slow and causes timeout (above 4 seconds)
>>>>>>>>
>>>>>>>> SELECT FROM SomeVertex WHERE @rid in (SELECT in FROM SomeEdge
>>>>>>>> WHERE out IN #27:819)
>>>>>>>>
>>>>>>>> I'm assuming I could write this in another way, but I'm more
>>>>>>>> interested in why this is slow. Is this a bug or if not are there any
>>>>>>>> details on how/why this is slow?
>>>>>>>>
>>>>>>>> --
>>>>>>>>
>>>>>>>> ---
>>>>>>>> You received this message because you are subscribed to the Google
>>>>>>>> Groups "OrientDB" group.
>>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>>> send an email to [email protected].
>>>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>>>>
>>>>>>>
>>>>>>>  --
>>>>>>
>>>>>> ---
>>>>>> You received this message because you are subscribed to the Google
>>>>>> Groups "OrientDB" group.
>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>> send an email to [email protected].
>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>>
>>>>>
>>>>>  --
>>>>
>>>> ---
>>>> You received this message because you are subscribed to the Google
>>>> Groups "OrientDB" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to [email protected].
>>>> For more options, visit https://groups.google.com/d/optout.
>>>>
>>>
>>>  --
>>>
>>> ---
>>> You received this message because you are subscribed to a topic in the
>>> Google Groups "OrientDB" group.
>>> To unsubscribe from this topic, visit
>>> https://groups.google.com/d/topic/orient-database/-U6IZNLAtSQ/unsubscribe
>>> .
>>> To unsubscribe from this group and all its topics, send an email to
>>> [email protected].
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>  --
>>
>> ---
>> You received this message because you are subscribed to the Google Groups
>> "OrientDB" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>  --
>
> ---
> You received this message because you are subscribed to a topic in the
> Google Groups "OrientDB" group.
> To unsubscribe from this topic, visit
> https://groups.google.com/d/topic/orient-database/-U6IZNLAtSQ/unsubscribe.
> To unsubscribe from this group and all its topics, send an email to
> [email protected].
> For more options, visit https://groups.google.com/d/optout.
>

-- 

--- 
You received this message because you are subscribed to the Google Groups 
"OrientDB" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
For more options, visit https://groups.google.com/d/optout.

Re: [orientdb] No index use when using sub query

Reply via email to