Re: [orientdb] Slow performance when building a weighted table out of edge-vertex relationships

Izzet Pembeci Mon, 27 Apr 2015 03:22:12 -0700

This is surprizing indeed. Can't you put the timestamp where clause just 
before outer group by so that you eliminate one subquery:


SELECT inV().name AS name, COUNT(*) AS weight FROM (

  SELECT expand(outE("UserUsedTag")) 
  FROM user
  WHERE username = "john.doe"
 ) 

WHERE timestamp > date('2015-04-25 05:00:22')
 AND timestamp < date('2015-04-25 05:01:10')
GROUP BY in
ORDER BY weight DESC

What does the EXPLAIN return for this latest one and can you share your 
index definitions (may be there is a problem with those)?

iZzeT


On Sunday, April 26, 2015 at 4:58:11 PM UTC+3, Jean-Sebastien Lemay wrote:
>
> Hi Luca,
>
> Thanks for the advice. However in your query I am unable to filter by 
> timestamp on the edge. Instead, I ended up with:
> SELECT inV().name AS name, COUNT(*) AS weight FROM (
>
>  SELECT FROM (
>   SELECT expand(outE("UserUsedTag")) 
>   FROM user
>  WHERE username = "john.doe"
>  ) WHERE timestamp > date('2015-04-25 05:00:22')
>  AND timestamp < date('2015-04-25 05:01:10')
> )
> GROUP BY in
> ORDER BY weight DESC
> soccer 5020
> baseball 4939
> Query executed in 0.468 sec. Returned 2 record(s)
>
> The above query seems to take into account all indexes (index on 
> User.username and UserUsedTag.timestamp).
>
> Still, 468ms for ~10k records. Am I correct then that OrientDB in the end 
> is a tradeoff, i.e. it's a lot more developer friendly and has tons of 
> features, but at the cost of rougly ~40x the performance of a traditional 
> relational DB such as PostgreSQL?
>
>
>
> On Sunday, April 26, 2015 at 6:33:38 PM UTC+8, l.garulli wrote:
>>
>> On 25 April 2015 at 16:41, Jean-Sebastien Lemay <[email protected]> 
>> wrote:
>>
>>> Hi there,
>>>
>>
>> Hi Jean-Sebastien
>>  
>>
>>>
>>> OK -- I'm a bit surprised, in a way; this is a straightforward SELECT 
>>> statement.
>>>
>>> I decided to run another experiment. My scenario: determine which tags 
>>> are a given user's favourite.
>>>
>>> My query in OrientDB:
>>> SELECT in.name AS name, COUNT(in.name) AS weight 
>>> FROM UserUsedTag
>>> WHERE "john.doe" IN out.username
>>> AND timestamp < date()
>>> GROUP BY in.name
>>> ORDER BY weight DESC
>>> Query executed in 0.487 sec. Returned 2 record(s)
>>> soccer 5042
>>> baseball 4958
>>>
>>
>> This is not the most efficient way to do that, because the index on 
>> username wouldn't be used. With a Graph Database, you should lookup at the 
>> stating vertex and then cross relationships. Try this:
>>
>> select name, count(*) as weight
>> from (
>>   select expand( in() ) from User where username = "john.doe"
>> ) group by name, order by weight
>>
>> Assure to have an index on User.username. I'd suggest a Unique index of 
>> type HASH_INDEX.
>>
>> Lvc@
>>
>>

-- 

--- 
You received this message because you are subscribed to the Google Groups 
"OrientDB" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
For more options, visit https://groups.google.com/d/optout.

Re: [orientdb] Slow performance when building a weighted table out of edge-vertex relationships

Reply via email to