Hello people! I would like to hear your opinion on whether OrientDB supports some kind of sharding at the sub-graph level (I mean not sharding by individual records, but grouping records of different classes/types on the same node by a "subgraph id").
Some background: we are evaluating an option to migrate our existing service to a graph DB in AWS. At the moment we have a relational DB living in our own data center. The user base is about 10 million users.

Our domain has a number of entities: User (the main entity) and Device, Purchase, Account, etc. (secondary entities). Each entity has about 20-50 attributes and relations to other entities: mostly relations from the main entity User to secondary entities (on average up to 1000 relations per user) and some occasional relations between secondary entities. Ideally, User entities could also be connected to each other, but we can sacrifice that for now.

This domain is very well represented by a graph structure (on paper), and we are considering OrientDB as an option to store and query our data (it looks very promising). However, we can't fit the whole graph on one node, so we have to deal with sharding of the data. I realize that a graph distributed across multiple nodes is slower than a graph on one machine if all vertices could be interconnected. But our current querying needs could be limited to working with the subgraph of a particular User: finding the User by id and fetching its different relations. So it's a kind of detached subgraph per user, and that is the unit we would like to shard by.

Ideally, when a user logs in to our app server, the app server would traverse that user's subgraph so that we can benefit from DB caching on the same node (assuming the same user is always directed to the same app server node), with little communication with other nodes. So app server node A caches the subgraphs for users 1, 2, 3, app server node B caches the subgraphs for users 4, 5, 6, etc. It's a kind of sticky sharding of subgraphs.

However, I'm not sure whether this approach can be implemented using OrientDB's sharding strategies. What I could see there is sharding at the record level (i.e. by User or by Purchase), but I would like to see all records (of class User, Device, Purchase, etc.)
which are "tagged" with the same userId to be on the same node. Is that possible? What kind of sharding strategy do you recommend? Or do we have to build another distributed in-memory database on top of OrientDB for caching?

Thank you in advance!
Anton
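
P.S. To make the placement rule I am asking about concrete, here is a minimal sketch in plain Python. It is not OrientDB API: the node names, the `userId` field and both helper functions are made up for illustration. The only point is that every record is placed by the userId it is tagged with, never by its own class or record id.

```python
import hashlib

# Hypothetical node names, for illustration only.
NODES = ["node-a", "node-b", "node-c"]


def node_for_user(user_id, nodes=NODES):
    """Map a userId to one node with a stable hash (same input, same node)."""
    digest = hashlib.sha256(str(user_id).encode("utf-8")).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


def node_for_record(record, nodes=NODES):
    """Place a record of any class by its userId tag, not by its own id."""
    return node_for_user(record["userId"], nodes)


# One user's detached subgraph: records of different classes, same tag.
records = [
    {"class": "User", "id": "u42", "userId": "42"},
    {"class": "Device", "id": "d7", "userId": "42"},
    {"class": "Purchase", "id": "p1001", "userId": "42"},
    {"class": "Account", "id": "a3", "userId": "42"},
]

for record in records:
    print(record["class"], "->", node_for_record(record))
```

The same function could also pick the app server for a login, which would give the sticky routing described above. A plain modulo remaps most users whenever the node list changes, so a real setup would probably want consistent hashing instead.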
