Hello people!
I would like to hear your opinion on whether OrientDB supports sharding at 
the sub-graph level (I mean not sharding by individual records, but 
grouping records of different classes/types on the same node by a "subgraph 
id").

Some background:
We are evaluating an option to migrate our existing service to a graph DB in 
AWS.
At the moment we have a relational DB living in our own data center.
The user base is about 10 million users.
Our domain has a number of entities - User (main entity), Device, Purchase, 
Account, etc. (secondary entities)
Each entity has roughly 20-50 attributes and relations to other entities: 
mostly relations from the main entity User to secondary entities (on average 
up to 1000 relations) and some occasional relations between secondary 
entities.
Ideally, User entities could also be connected to each other, but we can 
sacrifice that for now.
This domain is very well represented by a graph structure (on paper) and we 
are considering OrientDB as an option to store and query our data (it looks 
very promising).

However, we can't fit the whole graph on one node, so we have to deal with 
sharding the data.
I realize that a graph distributed across multiple nodes is slower than a 
graph on one machine when any vertices can be interconnected.
But our current querying needs could be limited to working with the 
subgraph of a particular User - fetching different relations from a User 
(found by id) - so it's a kind of detached subgraph per user, and we 
would like to shard by these subgraphs.
So ideally, when a user logs in to our app server, the app server 
should traverse that user's subgraph, so that we can benefit from db 
caching on the same node (assuming the same user is always directed to the 
same app server node) and there is little communication with other nodes.
So app server node A caches the user subgraphs for users 1,2,3 and app 
server node B caches the user subgraphs for users 4,5,6, etc. - a kind of 
sticky sharding of subgraphs.
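
To make the "sticky" part concrete, here is a minimal sketch of how a user 
could always be directed to the same app server node. It is plain Python and 
does not use any OrientDB API; the function name node_for_user and the node 
names are hypothetical, and hash-modulo routing is just one assumed way to 
do it.

```python
import hashlib
from typing import Sequence


def node_for_user(user_id: str, nodes: Sequence[str]) -> str:
    """Pick the same node for the same user id on every call.

    Assumption: the list of nodes is fixed and ordered identically on
    every router. Adding or removing a node remaps many users.
    """
    if not nodes:
        raise ValueError("at least one node is required")
    # Use a stable digest; Python's built-in hash() is salted per process.
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


if __name__ == "__main__":
    app_nodes = ["app-a", "app-b", "app-c"]  # hypothetical node names
    for uid in ["1", "2", "3", "4", "5", "6"]:
        print(uid, "->", node_for_user(uid, app_nodes))
```

With plain modulo, changing the number of nodes moves most users to a 
different node, so the caches would have to warm up again; consistent 
hashing would reduce that, at the cost of more code.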
However, I'm not sure whether this approach could be implemented using 
OrientDB sharding strategies...
What I could see there is sharding on the record level (i.e. by User or 
Purchase), but I would like all records (of class User, Device, 
Purchase, etc.) which are "tagged" with the same userId to be on the same 
node.
Is it possible? What kind of sharding strategy do you recommend? Or do we 
have to build another distributed in-memory database on top of Orient for 
caching?
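
To illustrate the placement I have in mind, here is a small sketch in plain 
Python (not OrientDB code): the shard is derived only from the userId tag, 
never from the record's class or its own id, so a user's whole subgraph ends 
up together. The Record type, shard_for and group_by_shard are hypothetical 
names made up for this example.

```python
import zlib
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Record:
    record_class: str  # e.g. "User", "Device", "Purchase"
    record_id: str
    user_id: str       # the "subgraph id" every record is tagged with


def shard_for(record: Record, shard_count: int) -> int:
    """Derive the shard from the user_id tag only."""
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    # crc32 is stable across processes, unlike the built-in hash().
    return zlib.crc32(record.user_id.encode("utf-8")) % shard_count


def group_by_shard(records: Iterable[Record],
                   shard_count: int) -> Dict[int, List[Record]]:
    """Group records of any class by the shard of their owning user."""
    placement: Dict[int, List[Record]] = {}
    for record in records:
        placement.setdefault(shard_for(record, shard_count), []).append(record)
    return placement


if __name__ == "__main__":
    sample = [
        Record("User", "u1", "u1"),
        Record("Device", "d7", "u1"),
        Record("Purchase", "p3", "u1"),
        Record("User", "u2", "u2"),
    ]
    for shard, recs in sorted(group_by_shard(sample, 4).items()):
        print(shard, [(r.record_class, r.record_id) for r in recs])
```

Relations between two different users would cross shards under this scheme, 
which is the part we said we can sacrifice for now.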

Thank you in advance!
Anton
