Susan, as far as I understand your model and your procedures, Google Cloud Datastore should be a good choice. My suggestions are based on my experience with Python and the NDB API, but I assume the other client libraries do not differ significantly.
Since your app maintains hard limits on the number of friendships and friend requests, I suggest tweaking your model a little so that you can use strongly consistent queries (which require an ancestor, i.e. a parent). By doing this, we put the Friend objects into so-called entity groups, where each parent forms its own group.

FRIEND
-parent (Key of the owning user)
-id (auto-generated)
-friend1
-friend2
-status

This way, the app can perform a strongly consistent, keys-only query on both the number of friendships and the number of pending friend requests. I guess both queries would happen rather often (each time a friend request is created or a friendship is accepted).

Furthermore, you already suggested having two objects, one for each side of the friendship. Good thinking. Since we already have the key of friend1 as the parent, we could remove it from the properties, too.

In addition, we could use the ID of the friend as the ID of this relationship to avoid inconsistency through duplicate relationships. (I assume all User keys are parent-less, so the IDs already guarantee unique keys for all users in the datastore.) In general, it is cheaper and faster to look up a key than to run a query. This way, the existence of a key tells us whether there is already a Friend object. The app can also compute the Friend key and get the object directly, rather than always querying first. And since we already have the friend's key (via the ID), we can also remove the property friend2.

FRIEND
-parent (Key of friend1)
-id (ID of friend2's key)
-status

The status must be indexed, though, to perform the counts on requests and existing friendships.

Any creation, update, or deletion of Friend objects would happen in a transaction across two groups (those of friend1 and friend2), so both legs of the relationship stay consistent.

In my experience using webapp2 for user authentication, it pays to keep the actual user account (the kind used for authentication) separate from everything else in the app.
So I would not use User keys as the parent or for the friend key property, but instead computed keys with the same ID as the user (for uniqueness and consistency) and a different kind. Parents don't need to exist.

FRIEND
-parent (Key of kind FriendParent, same ID as the corresponding user)
-id (ID of friend2)
-status

For example, take two stored User objects, id:1 and id:2. As a result of a friend request from 1 to 2, these two objects are created:

FRIEND
-parent: KEY(FriendParent, 1)
-id: 2
-status: not_accepted_yet

FRIEND
-parent: KEY(FriendParent, 2)
-id: 1
-status: new_request_to_accept

The parent keys don't exist as objects in the datastore. They only exist as parent keys in the Friend objects to put them into entity groups, so the app can apply strongly consistent ancestor queries for each user.

Downside of the variant with a parent: there is a technical limitation on how often the app (or any other client) can write into the same entity group (1/sec). In this model, that gives a hard limit on how often each user can:

- send/revoke a friend request (or by the friend)
- accept/deny a request (or by the friend)
- remove a friend (or by the friend)

In other words: with parents in Friend, the datastore cannot maintain huge numbers of friendships, which imply a high frequency of such write operations, like followers on Twitter, channel subscriptions on YouTube, etc. However, considering the rather low hard limits (200/100), I thought this constraint doesn't matter.

If it does matter, we cannot put Friend into bigger entity groups, but then we also lose the ability to perform strongly consistent queries (which require an ancestor/parent). An eventually consistent, ancestor-less query (for example, when the app counts the number of requests for a user) may miss an entity that was written just milliseconds or seconds before, so the limit could be slightly exceeded for some users.
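
To make the parent variant above a bit more concrete, here is a minimal sketch in plain Python. It does not use the NDB API: a dict stands in for the datastore, and key paths are plain tuples. The names store, friend_key, count_by_status, send_friend_request and MAX_PENDING are made up for illustration, and I assume that both outgoing and incoming requests count against the limit of 100.

```python
# In-memory stand-in for the datastore: maps a key path to a Friend record.
# A key path is a tuple of (kind, id) pairs, parent first.
store = {}

MAX_PENDING = 100  # pending-request limit taken from the original question
PENDING = ("not_accepted_yet", "new_request_to_accept")


def friend_key(owner_id, friend_id):
    """Key of the Friend record owned by owner_id that points at friend_id.

    The FriendParent part is never stored as an object of its own.
    """
    return (("FriendParent", owner_id), ("Friend", friend_id))


def count_by_status(owner_id, statuses):
    """Stand-in for a keys-only ancestor query filtered on status."""
    parent = ("FriendParent", owner_id)
    return sum(1 for key, record in store.items()
               if key[0] == parent and record["status"] in statuses)


def send_friend_request(sender_id, recipient_id):
    """Create both legs of the relationship; return False if not allowed."""
    out_key = friend_key(sender_id, recipient_id)
    in_key = friend_key(recipient_id, sender_id)
    # The keys are computed, so a lookup replaces the "does it exist?" query.
    if out_key in store or in_key in store:
        return False
    if (count_by_status(sender_id, PENDING) >= MAX_PENDING
            or count_by_status(recipient_id, PENDING) >= MAX_PENDING):
        return False
    # In a real datastore these two writes belong in one transaction
    # spanning the two entity groups.
    store[out_key] = {"status": "not_accepted_yet"}
    store[in_key] = {"status": "new_request_to_accept"}
    return True
```

Calling send_friend_request(1, 2) creates exactly the two objects from the example above. A second call in either direction is rejected by the key lookup alone, without any query.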
If the write limit does matter, I would suggest keeping each Friend in its own entity group:

FRIEND
-id (similar to mutual_id: friend1_ID:friend2_ID)
-friend1
-friend2
-status

The app can still write both Friend objects in one transaction. With this little tweak, the app could at least get a Friend by key in more use cases than with an auto-generated ID.

I will continue my suggestions with the parent version of Friend.

You have mentioned the display of names. As a general rule of thumb (and I would think that many Datastore users follow this rule), you normalize data less than in SQL databases, because there are no join queries and such. As long as it is only names, I would think that user names don't change frequently, so I would add the friend's name to the Friend model. That way we don't need to look up the current name of up to 300 users every time we show the list of friends and friend requests (that would double the reads).

If a user changes the name, we need to update all Friend objects that refer to this user. Given the parent variant of Friend, we can do this with strong consistency: perform an ancestor query for all Friend objects owned by this user (or rather, by the user's FriendParent key), compute the keys of the mirrored Friend objects, and batch-update them to the new name in a few separate transactions.

As no limit on the number of emails has been mentioned, doing the same for emails could be pretty heavy on writes. Maybe a few thousand emails to touch for each name change? And this would be needed for both sender and recipient.

Furthermore, what about users' avatar images or other profile information? They may change more frequently.

I think it's difficult to forecast all the scenarios, so you would have to decide which approach is cheaper. It is probably safe to assume that the name and the avatar won't change often later on, so it makes sense to write them directly into Friend and maybe also EMail.
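
The name-change update described above could be sketched like this, again in plain Python with a dict standing in for the datastore rather than real NDB calls. The helper names (friend_key, mirrored_keys, rename_user) are made up, and the batch size of 25 is only an illustrative value, not a documented limit.

```python
def friend_key(owner_id, friend_id):
    """Key of the Friend record owned by owner_id that points at friend_id."""
    return (("FriendParent", owner_id), ("Friend", friend_id))


def mirrored_keys(store, user_id):
    """Keys of the Friend records that other users hold about user_id.

    Scanning for the parent stands in for the ancestor query; the mirrored
    keys are then computed from the friend IDs, not queried.
    """
    parent = ("FriendParent", user_id)
    return [friend_key(key[1][1], user_id)
            for key in store if key[0] == parent]


def rename_user(store, user_id, new_name, batch_size=25):
    """Write new_name into all mirrored records; return the batch count."""
    keys = mirrored_keys(store, user_id)
    batches = [keys[i:i + batch_size]
               for i in range(0, len(keys), batch_size)]
    for batch in batches:
        # Each batch would be one separate transaction in a real datastore.
        for key in batch:
            store[key]["name"] = new_name
    return len(batches)
```

With at most 300 Friend objects per user, this stays at a handful of batches per name change. The records the user owns are left untouched, because they carry the friends' names, not the user's own.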
With name and avatar added, the Friend model looks like this:

FRIEND
-parent (Key of kind FriendParent, same ID as the corresponding user)
-id (ID of friend2)
-name (of friend2)
-imgUrl (of friend2)
-status

Every other piece of profile-related data should be stored in a separate kind, especially if it can change frequently (last seen, online status, etc.). In the HTML templates, or with some JS wizardry, the link to each user's profile can be computed easily even in a friend or request list, without actually reading a User object. Basically, a Friend object would contain everything the app needs for the most frequent requests.

As I mentioned earlier, I would separate a user's authentication-related data from the profile-related data. So instead of putting name, avatar, etc. into the User kind, I would put them into a UserProfile kind, where the ID is always the same as that of the corresponding user account.

USERPROFILE
-id (ID of USER)
-name
-imgUrl
-about me (etc.)
-status

One last note regarding the ID of USER: I suggest not using the email address as the ID, because an email address can change, while datastore keys (the ID is part of the key) are immutable. An auto-generated ID would be fine.

For the standard GAE environment there is the Users API available, if you want to rely on Google accounts or OpenID. I use custom user management and authentication based on webapp2, but certainly other frameworks provide similar features. I found this much safer and easier than implementing my own authentication features. There is so much that could go wrong.

Well, I hope this helped you a little.

Ani

On Wednesday, April 13, 2016 at 1:22:52 AM UTC+2, Susan Lin wrote:
>
> I am looking how to create an efficient model which will satisfy the
> requirements I put below. I have tried using gcloud-node but have noticed
> it has limitations with read consistencies, references, etc.
> I would prefer to write this in nodejs, but would be open to writing in
> java or python as long as it would improve my model. I am building around
> the new pricing model which will come July 1st.
>
> My application consists of a closed email system. In essence what happens
> is users register to the site. These users can make friends. Then they can
> send emails to each other.
>
> *Components of the app:*
>
> Users - Unlimited amount of users can join.
>
> Friends - A User can have 200 confirmed friends and 100 pending friend
> requests. When a friendlist is retrieved it should show the name of the
> friend. (I will also need to receive the id of the friends so I can use it
> on my client side to create emails).
>
> Emails - Users can send emails to their friends and they can receive
> emails from their friends. The user can then view all their sent emails
> independently (sentbox) and all their received emails independently (inbox).
> They can also view the emails sent between themselves and a friend,
> ordered by newest. The emails should show the senders' and receivers' names.
> Once an email is read it needs to be marked as read.
>
> My model looks something like this, but as you can see there are
> inefficiencies.
>
> *Datastore Kinds:*
>
> USER
> -email (id) //The email doesn't need to be the id, but I need to be able to
> retrieve users by their email
> -hash_password
> -name
> -account_status
> -created_date
>
> FRIEND
> -id (auto-generated)
> -friend1
> -friend2
> -status
>
> EMAIL
> -id (auto-generated)
> -from
> -to
> -mutual_id
> -message
> -created_date
> -has_seen
>
> *Procedures of the application:*
>
> *Register* - Get operation to see if a user with this email exists. If
> it does not, insert the key.
>
> *Login* - Get operation to get user based on email. If exists retrieve
> the hash_password from the entity and compare to user's input.
>
> *Send friend request* - Friend data will be written twice for every
> relationship.
> Then using the index on friend1 and index on status I will
> query all the friends for a user and filter only those which are 'pending'.
> I will then count these friends and see if they are over X. Again I will do
> this for the other user. If they are both not over the pending limit, I
> will insert the friend request. This needs to run in a transaction.
>
> *Accept a friend request* - Friend data will be written twice for every
> relationship. Then using the index on friend1 and index on status I will
> query all the friends for a user and filter only those which are pending. I
> will then count these friends and see if they are over X. Again I will do
> this for the other user. If they are both not over the pending limit, I
> will change both entities' status to accepted as a transaction.
>
> *Show confirmed friends* - Friend data will be written twice for every
> relationship. Then using the index on friend1 and index on status I will
> query all the friends for a user and filter only those which are accepted.
> Not sure how I will show the friend's names (e.g what happens if a user
> changed their name this needs to be reflected in all friend relationships
> and emails!).
>
> *Show pending friends* - Friend data will be written twice for every
> relationship. Then using the index on friend1 and index on status I will
> query all the friends for a user and filter only those which are pending.
> Not sure how I will show the friend's names (e.g what happens if a user
> changed their name this needs to be reflected in all friend relationships
> and emails!).
>
> *View sent emails* - Using the index on the from property I would query
> to get all the sent emails from a user 5 at a time ordered by created_date
> (newest first). (e.g what happens if a user changed their name this needs
> to be reflected in all friend relationships and emails!).
>
> *View received emails* - Using the index on the to property I would query
> to get all the received emails to a user 5 at a time ordered by
> created_date (newest first). When an email is seen it will update that
> entity's has_seen property to true. (e.g what happens if a user changed
> their name this needs to be reflected in all friend relationships and
> emails!).
>
> *View emails between 2 users* - Using the index on mutual_id which is
> based on [lower_lexicographic_email]:[higher_lexicographic_email] to query
> the mutual emails. Ordered by newest, 5 at a time. (e.g what happens if a
> user changed their name this needs to be reflected in all friend
> relationships and emails!).
>
> *Create email* - Using the friend1 and status index I will confirm the
> users are friends. If they are friends, I will insert an email.
>

--
HATZIS Edelstahlbearbeitung GmbH
Hojen 2
87490 Haldenwang (Allgäu)
Germany

Commercial register Kempten (Allgäu): HRB 4204
Managing directors: Paulos Hatzis, Charalampos Hatzis
VAT identification number: DE 128791802
GLN: 42 504331 0000 6

http://www.hatzis.de/
