Hi, thanks for looking into this situation, it's really appreciated!
I repopulated my database and ran your query:
SELECT inV().name AS name, COUNT(*) AS weight FROM (
SELECT expand(outE("UserUsedTag"))
FROM user
WHERE username = "john.doe"
)
WHERE timestamp > date('2015-04-27 15:13:00')
AND timestamp < date('2015-04-27 15:17:00')
GROUP BY in
ORDER BY weight DESC
Query executed in 0.474 sec. Returned 2 record(s)
baseball 5013
soccer 4987
Here is what the EXPLAIN returns:
METADATAPROPERTIES
@version
resultSize
current
recordReads
rangeQueryConvertedInBetween
limit
fetchingFromTargetElapsed
projectionElapsed
documentReads
orderByElapsed
evaluated
groupByElapsed
user
elapsed
resultType
expandElapsed
02#14:24599
<http://172.16.11.2:2480/studio/index.html#/database/TestOrientDb/browse/edit/14:24599>
100001-12727100000100000#5:0
<http://172.16.11.2:2480/studio/index.html#/database/TestOrientDb/browse/edit/5:0>
445.29556collection3
My schema is as follows:
<https://lh3.googleusercontent.com/-CtHv7envQbM/VT5T0kmvKbI/AAAAAAAAANs/xXG5AoAUKoU/s1600/schema_user.jpg>
(User extends V, UNIQUE_HASH_INDEX on User.username)
<https://lh3.googleusercontent.com/-8COWToHTE7Y/VT5UAs0WZBI/AAAAAAAAAN0/1vnRx2jtBaU/s1600/schema_tag.jpg>
(Tag extends V, UNIQUE_HASH_INDEX on Tag.name)
<https://lh3.googleusercontent.com/-Ma25A-g4A6k/VT5UIcKtBpI/AAAAAAAAAN8/nxipEy7_0fU/s1600/schema_user_used_tag.jpg>
(UserUsedTag extends E, NOTUNIQUE index on UserUsedTag.timestamp)
Let me know if you can determine what causes my query to take almost half a
second. Hopefully we uncover something I'm doing wrong -- the database is
great beyond this particular issue.
Best regards,
Jean
On Monday, April 27, 2015 at 6:21:04 PM UTC+8, Izzet Pembeci wrote:
>
> This is surprizing indeed. Can't you put the timestamp where clause just
> before outer group by so that you eliminate one subquery:
>
> SELECT inV().name AS name, COUNT(*) AS weight FROM (
>
> SELECT expand(outE("UserUsedTag"))
> FROM user
> WHERE username = "john.doe"
> )
>
> WHERE timestamp > date('2015-04-25 05:00:22')
> AND timestamp < date('2015-04-25 05:01:10')
> GROUP BY in
> ORDER BY weight DESC
>
> What does the EXPLAIN return for this latest one and can you share your
> index definitions (may be there is a problem with those)?
>
> iZzeT
>
>
> On Sunday, April 26, 2015 at 4:58:11 PM UTC+3, Jean-Sebastien Lemay wrote:
>>
>> Hi Luca,
>>
>> Thanks for the advice. However in your query I am unable to filter by
>> timestamp on the edge. Instead, I ended up with:
>> SELECT inV().name AS name, COUNT(*) AS weight FROM (
>>
>> SELECT FROM (
>> SELECT expand(outE("UserUsedTag"))
>> FROM user
>> WHERE username = "john.doe"
>> ) WHERE timestamp > date('2015-04-25 05:00:22')
>> AND timestamp < date('2015-04-25 05:01:10')
>> )
>> GROUP BY in
>> ORDER BY weight DESC
>> soccer 5020
>> baseball 4939
>> Query executed in 0.468 sec. Returned 2 record(s)
>>
>> The above query seems to take into account all indexes (index on
>> User.username and UserUsedTag.timestamp).
>>
>> Still, 468ms for ~10k records. Am I correct then that OrientDB in the end
>> is a tradeoff, i.e. it's a lot more developer friendly and has tons of
>> features, but at the cost of rougly ~40x the performance of a traditional
>> relational DB such as PostgreSQL?
>>
>>
>>
>> On Sunday, April 26, 2015 at 6:33:38 PM UTC+8, l.garulli wrote:
>>>
>>> On 25 April 2015 at 16:41, Jean-Sebastien Lemay <[email protected]
>>> > wrote:
>>>
>>>> Hi there,
>>>>
>>>
>>> Hi Jean-Sebastien
>>>
>>>
>>>>
>>>> OK -- I'm a bit surprised, in a way; this is a straightforward SELECT
>>>> statement.
>>>>
>>>> I decided to run another experiment. My scenario: determine which tags
>>>> are a given user's favourite.
>>>>
>>>> My query in OrientDB:
>>>> SELECT in.name AS name, COUNT(in.name) AS weight
>>>> FROM UserUsedTag
>>>> WHERE "john.doe" IN out.username
>>>> AND timestamp < date()
>>>> GROUP BY in.name
>>>> ORDER BY weight DESC
>>>> Query executed in 0.487 sec. Returned 2 record(s)
>>>> soccer 5042
>>>> baseball 4958
>>>>
>>>
>>> This is not the most efficient way to do that, because the index on
>>> username wouldn't be used. With a Graph Database, you should lookup at the
>>> stating vertex and then cross relationships. Try this:
>>>
>>> select name, count(*) as weight
>>> from (
>>> select expand( in() ) from User where username = "john.doe"
>>> ) group by name, order by weight
>>>
>>> Assure to have an index on User.username. I'd suggest a Unique index of
>>> type HASH_INDEX.
>>>
>>> Lvc@
>>>
>>>
--
---
You received this message because you are subscribed to the Google Groups
"OrientDB" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
For more options, visit https://groups.google.com/d/optout.