Hey Dan,

If you're getting large bursts of requests, there is a very good chance they will not hit the app / datastore in the order they were made anyway, at least if you define order-they-were-made as the 'true' click order (hope that is clear). There will be network latencies, instance spin-ups, and various other delays / momentary hang-ups outside your control.
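To illustrate the point, here is a tiny sketch in plain Python. The request ids, click times, and delays are made-up numbers, and `arrival_order` is a hypothetical helper, not anything from the App Engine SDK. It only shows that unequal delays are enough to scramble the click order by the time requests arrive.

```python
# Hypothetical data: (request id, click time, delay before reaching the app),
# all in seconds. The delay stands in for network latency, instance spin-up, etc.
clicks = [("A", 0.00, 0.15), ("B", 0.01, 0.05), ("C", 0.02, 0.08)]


def arrival_order(clicks):
    """Return request ids sorted by arrival time (click time + delay)."""
    by_arrival = sorted(clicks, key=lambda c: c[1] + c[2])
    return [request_id for request_id, _sent, _delay in by_arrival]


# Clicked in the order A, B, C, but A is delayed the most.
print(arrival_order(clicks))  # ['B', 'C', 'A']
```

So whatever ordering you record on the server side is the arrival order at best, not the click order.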
I think your idea sounds like a good solution. You'll be able to increment the memcache counter quite rapidly; I've run batch jobs with global memcache counters at the scale you're talking about with no real issue (although I've come very close to the daily memcache API call limit ;). In my experience, it is unlikely memcache would be flushed multiple times within the space of a few minutes. I sometimes see an active item get flushed, but it is rare. I would order by the counter unless you've got reason to believe the cache was flushed (if more than one entity has a count of one, it was flushed :)).

Robert

On Mon, Feb 21, 2011 at 10:04, Dan Dubois <[email protected]> wrote:
> Thanks for the replies.
> Robert,
> The rates come in bursts peaking over a few seconds and tail off quickly
> over the course of a few minutes. They are generated by users quickly
> clicking links in web browsers that use AJAX to connect to the GAE cloud. I
> would like to get the sequential order of these clicks over all users.
>
>
> I had a thought, I could have a 'Request' entity as so:
> class Request(db.Model):
>     request_id = db.StringProperty()
>     timestamp = db.DateTimeProperty(auto_now_add=True)
>     count = db.IntegerProperty(long)
>
> The timestamp property is taken from the instance's clock and count property
> is updated with memcache atomic increment for each new request. I could then
> use a task to select the Request entities ordered by timestamp and in memory
> also order by count. The count property takes precedence over the timestamp.
> The worst case scenario, where memcache is flushed before each new request,
> would degenerate to a resolution of the clock differences between each GAE
> instance.
> I don't expect I would hit the monotonically increasing value performance
> limitation as Ikai explained. However all I would need to do is not index
> timestamp and count properties and reindex them at a later date. Then the
> limitation would be how quickly can I call memcache increment.
> I don't expect this would be an issue either...
> In the general case, I think this method would provide the greatest accuracy
> and tolerate the occasional memcache flush. Relative to the latency
> variations inherent in the web, this could be a winner. If I needed to be
> even more specific, I could use Calvin's idea to make sure each GAE
> instance's clock isn't minutes different from other instances to minimise
> the disruption when memcache is flushed.
> What do you think?
>
> Best wishes,
> Dan
>
> --
> You received this message because you are subscribed to the Google Groups
> "Google App Engine" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to
> [email protected].
> For more options, visit this group at
> http://groups.google.com/group/google-appengine?hl=en.
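For what it's worth, here is a rough sketch of how the in-memory ordering step might look, written in plain Python so it runs without the App Engine SDK. `Row` is a hypothetical stand-in for the fetched Request entities and `order_requests` is a made-up helper name. The flush check is an assumption on my part: it treats any repeated count value as a sign that the counter restarted, and in that case falls back to timestamp order.

```python
from collections import Counter, namedtuple
from datetime import datetime, timedelta

# Hypothetical stand-in for a fetched Request entity.
Row = namedtuple("Row", ["request_id", "timestamp", "count"])


def order_requests(rows):
    """Return (ordered_rows, flushed).

    Orders by the counter when every count is unique. If any count value
    repeats, the counter is assumed to have restarted after a memcache
    flush, so the order falls back to (timestamp, count).
    """
    seen = Counter(row.count for row in rows)
    flushed = any(n > 1 for n in seen.values())
    if flushed:
        key = lambda row: (row.timestamp, row.count)
    else:
        key = lambda row: row.count
    return sorted(rows, key=key), flushed


# Example data: the timestamps disagree with the counter by a few milliseconds.
t0 = datetime(2011, 2, 21, 10, 0, 0)
rows = [
    Row("b", t0, 2),
    Row("a", t0 + timedelta(milliseconds=5), 1),
    Row("c", t0 + timedelta(milliseconds=1), 3),
]
ordered, flushed = order_requests(rows)
```

With no repeated counts the result follows the counter (a, b, c) even though the instance clocks disagree. A smarter fallback could order by counter within each run between flushes, but that needs a way to tell the runs apart, which this sketch does not attempt.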
