I'm interested in how best to store edges (relationships, e.g.
follow) between nodes (users), optimizing for response time on read.
I'll do some benchmarking but I'd like some advice on avoiding big O
horse pucky down the line, which kinda sorta requires insider
knowledge of how the datastore works.

To generate a facebook or twitter style page, one must first identify
the current user, then identify who the current user follows. For each
of the followed, one must get their data, subject to other constraints
like age or even limits on the number of results.

It seems like the first problem, finding followed users, is pretty
straightforward. Whether you use global node and edge tables or
maintain per-node inbound/outbound edge tables doesn't have much
effect. The query is going to be fast either way.
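
To make the two layouts concrete, here is a rough sketch using plain
Python structures rather than the datastore API. The names EDGES,
OUTBOUND, followed_global and followed_per_node are made up for
illustration, and the list scan only stands in for what would be an
indexed lookup in a real store.

```python
from collections import defaultdict

# Layout 1: one global edge table of (follower, followed) rows.
EDGES = [("josh", "alice"), ("josh", "bob"), ("alice", "bob")]


def followed_global(user):
    # In-memory scan; a real store would answer this from an index
    # on the follower column instead of reading every row.
    return sorted(dst for src, dst in EDGES if src == user)


# Layout 2: per-node outbound edge sets, built here from the same rows.
OUTBOUND = defaultdict(set)
for src, dst in EDGES:
    OUTBOUND[src].add(dst)


def followed_per_node(user):
    # .get avoids creating an empty entry for users with no edges.
    return sorted(OUTBOUND.get(user, ()))


print(followed_global("josh"))
print(followed_per_node("josh"))
```

Both functions return the same list of followed users; the layouts
differ only in where the edges live.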

However, the second problem looks tricky to me. Let's say that the data
are short status updates called "tweets" of Kind "Tweet". Each tweet
has a property "owner" which points to the user who created the
tweet. Finding all tweets of all followed users then requires a query
with lots of users OR'd together. This is the query that scares me
a little bit.
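
One way to picture that OR'd query is as one newest-first lookup per
followed user, merged and cut off at a limit. The sketch below does
this over in-memory lists; it is not the datastore API and I am not
claiming the datastore executes the query this way. TWEETS_BY_OWNER
and timeline are hypothetical names, and the timestamps are toy
integers.

```python
import heapq
from itertools import islice

# Hypothetical stand-in for the Tweet kind: owner -> list of
# (timestamp, text) pairs, each list already sorted newest first.
TWEETS_BY_OWNER = {
    "alice": [(9, "a3"), (5, "a2"), (1, "a1")],
    "bob": [(8, "b2"), (2, "b1")],
    "carol": [(7, "c1")],
}


def timeline(followed, limit):
    """Merge the per-owner lists newest first and keep `limit` tweets."""
    streams = (TWEETS_BY_OWNER.get(owner, []) for owner in followed)
    # heapq.merge is lazy, so only about `limit` items are pulled
    # from the merged stream, however long each owner's list is.
    merged = heapq.merge(*streams, key=lambda t: t[0], reverse=True)
    return list(islice(merged, limit))


print(timeline(["alice", "bob", "carol"], 4))
# prints [(9, 'a3'), (8, 'b2'), (7, 'c1'), (5, 'a2')]
```

In this picture the work grows with the number of followed users (one
stream each) plus the result limit, rather than with the total number
of tweets, provided each per-owner lookup is served in sorted order.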

Should I be concerned about this kind of query? How will it scale as
the content increases? How will it scale with the number of followed
users?

Thanks,
Josh
