@alexander I agree that implementing a "few reads/lots of writes" app on GAE is not the easy/typical case. With the currently planned pricing, GAE is a perfect deal for small lots-of-reads/few-writes sites (almost free), a great deal for large lots-of-reads/few-writes sites (better than EC2), and not such a good deal for a few-reads/lots-of-writes app. On the other hand, running a beacon service (whether you are serving ads or running stats) has significant scalability challenges anyway, and I am willing to take on the challenge of implementing it on GAE to leverage its scalability, even if that means there will not be much "profit margin" left.

@dave I have followed exactly the architecture you suggest internally: 3 web services.
The first I call the "recorder" (your "capturer"): it just captures the beacon hit.
The second I call the "processor" (your "store"): it updates the structures and incurs most of the CPU cost (in your suggestion you spread some of that CPU load to the reporter, and that could be a promising alternative).
The third, the "reporter", is really a Google Data Source implementation: it fetches the precalculated chart data from the datastore and drives Google Chart based reports.

I do the breakup to
1) improve batching
2) make beacon hits fast
3) make the reporting fast

Doing that breakup as fully independent apps doesn't actually change the cost per se (except if you take into account the free first 5M hits).

@geoffrey
> Relying on memcache as reliable storage even temporarily is almost
> certainly a bad idea.

Geoffrey, I am taking a gamble with this: GAE currently lacks the concept of a file system "logfile", i.e. a very efficient (think cheap) append-only buffered file that sequences chronologically all writes received from web servers. That's what they use internally to implement the Apache log facility. The only way to simulate this is via memcache.

So I have implemented the equivalent of a "Buffered Append Only LOGFILE", which accumulates writes in memory and flushes to disk every 100 lines or 1 minute (whichever happens first). Remember that OS-based logfiles also do not guarantee persistence until the "flush" has actually happened. If it "behaves" as expected I will suggest it to the gaeutilities guys.

My hope is that a frequently accessed memcache item doesn't disappear within 30-60 secs except in rare cases, and log files are OK with that: they are not truly transactional storage.

diomedes

On Feb 17, 9:25 am, Dave Warnock <[email protected]> wrote:
> Geoffrey,
>
> >> - a report application that grabs the data via api calls to a data
> >> store app, does processing and shoves it in memcache.
>
> > Relying on memcache as reliable storage even temporarily is almost
> > certainly a bad idea.
>
> Agreed. But a report server would not be doing so. If the data is not
> available via memcache it would grab it via api calls to the data
> store app. Ok slower then than direct bigtable access but just the
> same principal (except maybe you add a bit more processing between
> bigtable and memcache to get the data ready for the report).
>
> Dave
> --
> Dave Warnock: http://42.blogs.warnock.me.uk
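
P.S. To make the "Buffered Append Only LOGFILE" idea above concrete, here is a minimal sketch of how it could look. This is not my actual code: FakeCache is a hypothetical in-process stand-in for a memcache-style client, BufferedLog and the "log:..." key names are made up for illustration, and sink stands in for whatever does the real datastore write. It ignores races between concurrent requests, and any line evicted from the cache before the flush is simply lost, which is the gamble described above.

```python
import time


class FakeCache:
    """Hypothetical in-process stand-in for a memcache-style client."""

    def __init__(self):
        self._d = {}

    def incr(self, key, delta=1, initial_value=0):
        # Counter that starts from initial_value when the key is missing.
        self._d[key] = self._d.get(key, initial_value) + delta
        return self._d[key]

    def set(self, key, value):
        self._d[key] = value
        return True

    def get(self, key):
        return self._d.get(key)

    def get_multi(self, keys):
        return {k: self._d[k] for k in keys if k in self._d}

    def delete_multi(self, keys):
        for k in keys:
            self._d.pop(k, None)
        return True


class BufferedLog:
    """Append-only log buffered in a cache, flushed by line count or age."""

    def __init__(self, cache, sink, max_lines=100, max_age=60.0, clock=time.time):
        self.cache = cache
        self.sink = sink            # callable taking a list of lines (the "disk" write)
        self.max_lines = max_lines  # flush after this many buffered lines...
        self.max_age = max_age      # ...or this many seconds, whichever comes first
        self.clock = clock

    def append(self, line):
        # A shared counter gives every line a chronological sequence number.
        seq = self.cache.incr("log:seq", initial_value=0)
        self.cache.set("log:line:%d" % seq, line)

        now = self.clock()
        start = self.cache.get("log:start")
        if start is None:
            # First line since the last flush: start the age timer.
            self.cache.set("log:start", now)
            start = now

        flushed = self.cache.get("log:flushed") or 0
        if seq - flushed >= self.max_lines or now - start >= self.max_age:
            self.flush(flushed, seq)

    def flush(self, flushed, seq):
        keys = ["log:line:%d" % i for i in range(flushed + 1, seq + 1)]
        found = self.cache.get_multi(keys)
        # Lines that were evicted before the flush are silently dropped.
        lines = [found[k] for k in keys if k in found]
        if lines:
            self.sink(lines)
        self.cache.delete_multi(keys + ["log:start"])
        self.cache.set("log:flushed", seq)
```

Usage would be something like log = BufferedLog(cache, write_batch) once per app, then log.append(hit_line) on every beacon hit, where write_batch is whatever function performs the single batched write.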
