[orientdb] Slow performance when building a weighted table out of edge-vertex relationships

Jean-Sebastien Lemay Fri, 24 Apr 2015 04:14:44 -0700

To validate whether OrientDB is the right fit for my project, I've created a 
simple scenario:


   - Vertex classes (2):
      - *User*: represents a user
      - *Tag*: represents a tag that a user can interact with (e.g. post a 
      comment using that tag). Contains a *name* property
   - Edge classes (1):
      - *UserUsedTag*: connects a user to a tag. Contains a *timestamp* 
      property (which is indexed with NOTUNIQUE)
   
To sum it up:
*[V:User] ---[E:UserUsedTag]--> [V:Tag]*

I'm trying to build a query that will let me know which tags have been the 
most popular in the past [x] minutes/hours/days/months...
As such, here is an example of the query I've got right now:
SELECT inV().name as name, COUNT(in) AS weight 
FROM UserUsedTag 
WHERE timestamp < date('2015-04-24 09:40:00') 
GROUP BY in 
ORDER BY weight DESC
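
The "past [x] minutes/hours/days" window described above implies a cutoff computed relative to the current time. A minimal Python sketch of that calculation follows. The helper name `popular_tags_query` is hypothetical, it only builds the query text and does not talk to OrientDB, and it assumes the intended filter is "edges at or after the cutoff" (the example query above uses `<` instead).

```python
from datetime import datetime, timedelta


def popular_tags_query(now: datetime, window: timedelta) -> str:
    """Build the query text for tags used within `window` before `now`.

    Hypothetical helper: it returns a string only and never contacts a server.
    """
    cutoff = now - window
    # Same 'yyyy-MM-dd HH:mm:ss' layout as the date literal in the query above.
    stamp = cutoff.strftime("%Y-%m-%d %H:%M:%S")
    return (
        "SELECT inV().name AS name, COUNT(in) AS weight "
        "FROM UserUsedTag "
        f"WHERE timestamp >= date('{stamp}') "  # assumption: newer than cutoff
        "GROUP BY in "
        "ORDER BY weight DESC"
    )


# Example: the hour leading up to 2015-04-24 09:40:00.
query = popular_tags_query(datetime(2015, 4, 24, 9, 40, 0), timedelta(hours=1))
print(query)
```

Calendar months are not covered here, since `timedelta` has no month unit.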

The query works, and I get a proper result set:

name      weight
baseball  6117
soccer    5003

My problem is the performance:
*Query executed in 0.311 sec. Returned 2 record(s)*
If it takes 1/3 of a second to sift through ~11,000 edges, I can only 
imagine how crippled the performance will be with the millions of edges 
I expect to end up with. In fact, as I add new edges, the query time 
seems to increase linearly.

Here is the EXPLAIN for the above query:

Property                         Value
@version                         0
resultSize                       2
fullySortedByIndex               false
documentAnalyzedCompatibleClass  11120
recordReads                      11120
fetchingFromTargetElapsed        313
indexIsUsedInOrderBy             false
compositeIndexUsed               1
current                          #14:11120
documentReads                    11120
projectionElapsed                6
limit                            -1
orderByElapsed                   0
evaluated                        11120
groupByElapsed                   0
user                             #5:0
elapsed                          353.77362
resultType                       collection
involvedIndexes                  ["UserUsedTag.timestamp"]

It seems that 
'fetchingFromTargetElapsed' is the biggest bottleneck here. Is this due to 
the nature of my query? Perhaps my query is not optimal? I am new to graph 
DBs so I'd like to know if there is any way I can rephrase my query to end 
up with the same result.

The problem for me is that with a relational DB I would probably get 
better performance: I would query my "link" table and then, for each of my 
final two (2) rows, run one (1) query to retrieve the name of the 
corresponding Tag via the foreign key. Is there any way I can split my 
query similarly with OrientDB, instead of fetching the same Tag name 
multiple times?
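
The two-step, relational-style approach described above can be sketched in plain Python. This is only an illustration on in-memory data, not OrientDB code: the record IDs (`#12:0`, `#12:1`), the sample edges, and the function name `weighted_tags` are all hypothetical.

```python
from collections import Counter

# Hypothetical in-memory stand-ins for the edge class and the Tag class.
# Each edge is (timestamp, tag_rid); tag_rid plays the role of the foreign key.
edges = [
    ("2015-04-24 09:00:00", "#12:0"),
    ("2015-04-24 09:10:00", "#12:1"),
    ("2015-04-24 09:20:00", "#12:0"),
    ("2015-04-24 09:50:00", "#12:1"),
]
tags = {"#12:0": {"name": "baseball"}, "#12:1": {"name": "soccer"}}


def weighted_tags(edges, tags, before):
    # Step 1: count edges per tag identifier only; no Tag record is read here.
    # Timestamps in this fixed layout compare correctly as plain strings.
    counts = Counter(rid for ts, rid in edges if ts < before)
    # Step 2: resolve the name once per distinct tag, highest weight first.
    return [(tags[rid]["name"], weight) for rid, weight in counts.most_common()]


result = weighted_tags(edges, tags, "2015-04-24 09:40:00")
print(result)
```

With this split, the number of name lookups depends on the number of distinct tags in the result (two here) rather than on the number of edges scanned.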
