[orientdb] Re: Optimising Paginated Queries - Advise on best methodoloy

Sky Sat, 15 Aug 2015 13:35:17 -0700

Hey

I would like to post my own solution ;)


I found a very good page in OrientDB docs that I had missed

https://orientdb.com/docs/last/Pivoting-With-Query.html

This gave me the a-ha moment. Work form the smallest subset, and join out. 
This is somewhat a different conceptual model to a RDMS where one would try 
to avoid many nested subqueries.

Taking this on board, and using the example given to be clear on syntax, I 
wrote something like this based on single edge, with a label of the action, 
and createdAt stamp (so hybrid of designs 1 and 2):

SELECT actions, lastUpdated, Film.title, Film.watched_count as 
watched_count FROM (
  SELECT LIST(action) as actions, MAX(createdAt) AS lastUpdated, in AS Film 
from (
    SELECT  flatten(outE('filmAction')) AS actions FROM #12:25903 )
  GROUP BY in 
)
WHERE "owns" IN actions
ORDER BY lastUpdated DESC

The key here, is that I decided to use cache counters for the film totals. 
Without that, it's painful

Query time runs insanely fast < 100ms over 1 million filmActions (this 
query finds every film a user has actioned, and brings back the film data, 
along with the actions they user has made against that film, and the films 
total action counters)

This design also allows for the LIMIT, ORDER BY and if need be, WHERE 
clause on the action type. Set operations I was struggling with until I 
read that OrientDB section.

So..happy...but not quite. I wanted to go back to my Design 1, where the 
edge is direct, not discriminated by a label and avoid the need for cache 
counters

SELECT actions, lastUpdated, Film.title, Film.in_loved.size() AS loved, 
Film.in_owns.size() AS owns, Film.in_watched.size() AS watched FROM (
  SELECT LIST(@class) as actions, MAX(createdAt) AS lastUpdated, in AS Film 
from (
    SELECT  flatten(outE('loved','owns','watched')) AS actions FROM 
#12:25903 )
  GROUP BY in 
)
WHERE "owns" IN actions
ORDER BY loved DESC
LIMIT 10

this is giving me around ~230ms over 600k edges. So, whilst this is a more 
purist approach to Graph design, I do feel it won't scale as well already 
after prelim studies.

So, I'm now down to choosing between these 2 designs. If anyone can let me 
know how I can improve the last query here, then I would choose that 
design, but I think for now, I will use counters 

Thanks

Sky

-- 

--- 
You received this message because you are subscribed to the Google Groups 
"OrientDB" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
For more options, visit https://groups.google.com/d/optout.

[orientdb] Re: Optimising Paginated Queries - Advise on best methodoloy

Reply via email to