To validate if OrientDB is the right fit for my project, I've created a
simple scenario:
- Vertex classes (2):
  - *User*: represents a user
  - *Tag*: represents a tag that a user can interact with (e.g. post a
    comment using that tag). Contains a *name* property
- Edge classes (1):
  - *UserUsedTag*: connects a user to a tag. Contains a *timestamp*
    property (which is indexed with NOTUNIQUE)
To sum it up:
*[V:User] ---[E:UserUsedTag]--> [V:Tag]*
I'm trying to build a query that will let me know which tags have been the
most popular in the past [x] minutes/hours/days/months...
As such, here is an example of the query I've got right now:
SELECT inV().name as name, COUNT(in) AS weight
FROM UserUsedTag
WHERE timestamp < date('2015-04-24 09:40:00')
GROUP BY in
ORDER BY weight DESC
The query works, and I get a proper result set:
name      weight
baseball  6117
soccer    5003

My problem is the performance:
*Query executed in 0.311 sec. Returned 2 record(s)*
If it takes 1/3 of a second to sift through ~11,000 results, I can only
imagine how crippled the performance will be if I am dealing with millions
of edges, which I expect to end up with. In fact, as I add new edges, it
seems like the query time increases linearly.
Here is the EXPLAIN for the above query:
METADATA
  @version: 0
PROPERTIES
  resultSize: 2
  fullySortedByIndex: false
  documentAnalyzedCompatibleClass: 11120
  recordReads: 11120
  fetchingFromTargetElapsed: 313
  indexIsUsedInOrderBy: false
  compositeIndexUsed: 1
  current: #14:11120
  documentReads: 11120
  projectionElapsed: 6
  limit: -1
  orderByElapsed: 0
  evaluated: 11120
  groupByElapsed: 0
  user: #5:0
  elapsed: 353.77362
  resultType: collection
  involvedIndexes: ["UserUsedTag.timestamp"]

It seems that
'fetchingFromTargetElapsed' is the biggest bottleneck here. Is this due to
the nature of my query? Perhaps my query is not optimal? I am new to graph
DBs so I'd like to know if there is any way I can rephrase my query to end
up with the same result.
With a relational DB, I would probably get better performance by querying
my "link" table first and then, for the final two (2) rows, running one (1)
query each to retrieve the name of the corresponding Tag via the foreign
key. Is there any way I can split my query similarly in OrientDB, instead
of fetching the same Tag name multiple times?
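
To make the two-step idea concrete, here is a minimal sketch in plain Python. It only models the approach with in-memory data and does not use any OrientDB API. The record IDs, sample edges, and the popular_tags helper are all made up for illustration.

```python
from collections import Counter
from datetime import datetime

# Hypothetical stand-ins for the Tag vertices (record id -> name)
# and the UserUsedTag edges (target tag id, timestamp).
tags = {"#12:0": "baseball", "#12:1": "soccer"}
edges = [
    ("#12:0", datetime(2015, 4, 24, 9, 0)),
    ("#12:1", datetime(2015, 4, 24, 9, 10)),
    ("#12:0", datetime(2015, 4, 24, 9, 20)),
    ("#12:0", datetime(2015, 4, 24, 9, 50)),  # not before the cutoff, so excluded
]


def popular_tags(edges, tags, cutoff):
    # Step 1: count edges per tag id only, without reading any tag name.
    counts = Counter(tag_id for tag_id, ts in edges if ts < cutoff)
    # Step 2: resolve each name once, for the grouped rows only.
    return [(tags[tag_id], weight) for tag_id, weight in counts.most_common()]


result = popular_tags(edges, tags, datetime(2015, 4, 24, 9, 40))
print(result)
```

The point is that the name lookup happens once per distinct tag (two lookups here) rather than once per edge.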