Thanks Jeff. I think the I/O video I saw was from 2008, given by Brett. I'm
not sure it's the same one you're referencing (maybe I mixed them up), but it
does talk about the fan-out problem for microblogging and gives a solution.

The solution in the video is comprehensible, but two points stuck out for me:

1) You need to write all the recipients into each Message object. If the
author of the message has a large number of friends, the write can take a
good amount of time. This can be handled in the background using the task
queue.
2) The Message.recipients list is essentially locked after publication. So
if I write a message today and you become my follower tomorrow, today's
message won't show up in your feed (you weren't my follower when I
published it). That's OK for Twitter, but for building a histogram it won't
work, because I need to know all of your current friends' ratings for the
product.
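
To make point 2 concrete, here is a minimal in-memory sketch (plain Java, no datastore calls) of how I understand the fan-out model from the talk. The names (`FanOutSketch`, `follow`, `publish`, `feedFor`) are my own placeholders, not anything from the video or the App Engine API. The only point is that the recipient list is copied once, at publish time.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory model of fan-out on write; not App Engine API code.
class FanOutSketch {
    // A message carries a snapshot of the author's followers at publish time.
    static final class Message {
        final String author;
        final String body;
        final Set<String> recipients;

        Message(String author, String body, Set<String> recipients) {
            this.author = author;
            this.body = body;
            // Copied once here and never updated afterwards.
            this.recipients = new HashSet<>(recipients);
        }
    }

    private final Map<String, Set<String>> followers = new HashMap<>();
    private final List<Message> messages = new ArrayList<>();

    void follow(String follower, String author) {
        followers.computeIfAbsent(author, k -> new HashSet<>()).add(follower);
    }

    void publish(String author, String body) {
        Set<String> current = followers.getOrDefault(author, new HashSet<>());
        messages.add(new Message(author, body, current));
    }

    // A feed is every message whose recipient snapshot contains the user.
    List<String> feedFor(String user) {
        List<String> feed = new ArrayList<>();
        for (Message m : messages) {
            if (m.recipients.contains(user)) {
                feed.add(m.body);
            }
        }
        return feed;
    }
}
```

If kim follows mark before he publishes and ed follows afterwards, `feedFor("kim")` contains the message and `feedFor("ed")` is empty, which is exactly the behaviour that breaks the histogram.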

The first item is a little painful, but that cost has to be paid somewhere.
Item 2 is more problematic. If I treat each product rating instance like
a tweet and embed the recipients (the author's current friend network),
then whenever friends are added or dropped I need to go through and update
every past product rating instance with the updated friend list.
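
Here is a rough sketch of what that update would cost, again as plain in-memory Java with made-up names (`RatingFanOut`, `rate`, `addFriend`, `histogramFor`). It only illustrates the bookkeeping and is not datastore code. Each rating embeds the author's friend list, so adding one friend means touching every rating the author has ever written.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: product ratings that embed the author's friend list.
class RatingFanOut {
    static final class Rating {
        final String author;
        final String productId;
        final int rating; // 1 - 5
        final Set<String> recipients = new HashSet<>();

        Rating(String author, String productId, int rating) {
            if (rating < 1 || rating > 5) {
                throw new IllegalArgumentException("rating must be 1 - 5");
            }
            this.author = author;
            this.productId = productId;
            this.rating = rating;
        }
    }

    private final Map<String, Set<String>> friends = new HashMap<>();
    private final List<Rating> ratings = new ArrayList<>();

    // Stores a rating together with a snapshot of the author's friends.
    void rate(String author, String productId, int rating) {
        Rating r = new Rating(author, productId, rating);
        r.recipients.addAll(friends.getOrDefault(author, new HashSet<>()));
        ratings.add(r);
    }

    // Adds a friend and returns how many past ratings had to be rewritten.
    int addFriend(String author, String friend) {
        friends.computeIfAbsent(author, k -> new HashSet<>()).add(friend);
        int rewritten = 0;
        for (Rating r : ratings) {
            if (r.author.equals(author) && r.recipients.add(friend)) {
                rewritten++;
            }
        }
        return rewritten;
    }

    // Histogram of one product's ratings visible to viewer; index 0 is rating 1.
    int[] histogramFor(String viewer, String productId) {
        int[] counts = new int[5];
        for (Rating r : ratings) {
            if (r.productId.equals(productId) && r.recipients.contains(viewer)) {
                counts[r.rating - 1]++;
            }
        }
        return counts;
    }
}
```

The return value of `addFriend` is the number of writes a single friend change triggers, and it grows with the author's full rating history. Dropping a friend would need the same pass in reverse.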

I'll see if I can find the 2009 presentation and check whether it's different
from the one I saw. Thanks for your help,

Mark



On Mon, Aug 9, 2010 at 11:59 AM, Jeff Schwartz <[email protected]> wrote:

> You might want to catch the Google I/O '09 video on YouTube where fan-out is
> discussed. In the video, list index entities and keys-only queries are
> discussed as a way of defining and selecting large groups. If you can wrap
> your hands around the concepts and understand how the mentioned
> implementation works, you will have your answer. It is doable but it isn't
> very pretty. The good part is that it provides very quick queries and
> eliminates serializing entities that are only used as indexes.
>
> Just my $0.02.
>
> Jeff
>
> On Mon, Aug 9, 2010 at 2:04 PM, Mark <[email protected]> wrote:
>
>> Hi,
>>
>> I have a web app where users can add friends, and can rate products.
>> The model looks like:
>>
>>    class User {
>>        String username;
>>    }
>>
>>    class Friend {
>>        String username;
>>        String usernameFriend;
>>    }
>>
>>    class ProductRating {
>>        String username;
>>        String productId;
>>        int rating; // 1 - 5
>>    }
>>
>> When a user is viewing a product, I want to show them a histogram of
>> the ratings all their friends gave the product. Since the histogram is
>> not valuable unless all information is known, this becomes difficult
>> to do at scale because I need to:
>>
>>  1) For the given user, load all their friend names (could be
>> hundreds or thousands).
>>  2) For each friend, check if any of them have given a rating for the
>> product of interest.
>>  3) Aggregate all friend ratings into a histogram.
>>
>> I'll probably time out fetching and deserializing all those objects on
>> steps 1 & 2. I can precompute histograms for each user for each
>> product as everyone submits ratings. This would optimize reads later
>> on but would really increase storage requirements and add additional
>> cpu use on every rating submission. As friend relationships change, I
>> would have to also update all precomputed histograms, which would be a
>> pain.
>>
>>
>>
>> I'm thinking of doing the following, and wondering how poor an idea it
>> is. The basic idea is to keep a flat Text object of a user's friends,
>> and a product's ratings to build histograms in application code,
>> either on the server or the clients themselves:
>>
>>  class User {
>>      String username;
>>  }
>>
>>  class UserFriends {
>>      String username;
>>      Text friends;
>>  }
>>
>>  class ProductRatings {
>>      String productId;
>>      Text ratings;
>>  }
>>
>> A user's friends string might look like:
>>
>>  UserFriends.friends = "kim,greg,jen,ed,friendN";
>>
>> A product's rating string might look like:
>>
>>  ProductRatings.ratings = "kim:4,tim:5,ed:2,usernameN:ratingN";
>>
>> so in order to build the histogram, I need to:
>>
>>  // get my flat friends string.
>>  select from UserFriends where username='myusername';
>>
>>  // get the flat ratings string for the product.
>>  select from ProductRatings where productId='xyz';
>>
>> Once I have both flat strings, I can generate the histogram in
>> application code. The idea is that I have a better chance of storing
>> all friends and ratings information in the flat Text objects and
>> fetching it in a single HTTP connection than if I were to
>> fetch all the individual objects.
>>
>> I was wondering if anyone else has had to do something similar to
>> this, or if there are any approaches I'm overlooking. I spent a lot of
>> time implementing variations on Brett Slatkin's Google I/O 2008 talk
>> about building scalable applications on App Engine, specifically the
>> microblogging example. In the end, the introduction of a changing
>> friend network which impacts these histograms made any of my attempts
>> too costly to run.
>>
>> Any thoughts positive or negative would be welcome!
>>
>> Thanks
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "Google App Engine" group.
>> To post to this group, send email to [email protected].
>> To unsubscribe from this group, send email to
>> [email protected].
>> For more options, visit this group at
>> http://groups.google.com/group/google-appengine?hl=en.
>>
>>
>
>
> --
> --
> Jeff
>
