I'm interested in how to best store edges (relationships, e.g. follow) between nodes (users), optimizing for response time on read. I'll do some benchmarking but I'd like some advice on avoiding big O horse pucky down the line, which kinda sorta requires insider knowledge of how the datastore works.
To generate a Facebook or Twitter style page, one must first identify the current user, then identify who the current user follows. For each followed user, one must then fetch their data, subject to other constraints such as age or a limit on the number of results.

It seems like the first problem, finding the followed users, is pretty straightforward. Whether you use global node and edge tables or maintain per-node inbound/outbound edge tables doesn't have much effect. The query is going to be fast.

However, the second problem looks tricky to me. Let's say that the data are short status updates called "tweets" of Kind "Tweet". Each tweet has a property "owner" which points to the user who created the tweet. Finding all tweets of all followed users then requires a query with lots of users OR'd together. This is the query that scares me a little bit. Should I be concerned about this kind of query? How will it scale as the amount of content increases? How will it scale with the number of followed users?

Thanks,
Josh
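P.S. To make the scary query concrete, here is a rough sketch of my mental model, in plain Python rather than the datastore API. It assumes the OR'd query behaves like one single-owner query per followed user, with the results merged by recency and cut off at a limit. That may not be what the datastore actually does. The Tweet tuple, its "created" and "text" fields, and the function names are all made up for illustration.

```python
import heapq
import itertools
from collections import namedtuple

# Hypothetical in-memory stand-in for entities of Kind "Tweet".
Tweet = namedtuple("Tweet", ["owner", "created", "text"])


def tweets_by_owner(tweets, owner):
    """One single-owner query: that owner's tweets, newest first."""
    return sorted(
        (t for t in tweets if t.owner == owner),
        key=lambda t: t.created,
        reverse=True,
    )


def timeline(tweets, followed, limit=20):
    """Emulate owner == A OR owner == B OR ... as one sub-query per
    followed user, merged by recency and truncated to `limit`."""
    per_owner = [tweets_by_owner(tweets, owner) for owner in followed]
    # Each input list is already sorted newest first, so a k-way merge
    # yields a single newest-first stream without re-sorting everything.
    merged = heapq.merge(*per_owner, key=lambda t: t.created, reverse=True)
    return list(itertools.islice(merged, limit))
```

If this model is anywhere near right, the work grows with the number of followed users (one sub-query each) even when the limit is small, which is exactly the part I'm worried about.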
