@alexander
I agree that implementing a "few reads/lots of writes" app on GAE is
not the easy or typical case.
GAE with the currently planned pricing is a perfect deal for small
lots-of-reads/few-writes sites (almost free), a great deal for large
lots-of-reads/few-writes sites (better than EC2), and not such a good
deal for a few-reads/lots-of-writes app.
On the other hand, running a beacon service (whether you are serving
ads or running stats) has significant scalability challenges anyway,
and I am willing to take on the challenge of implementing it on GAE to
leverage its scalability, even if that means there will not be
much "profit margin" left.

@dave
Internally I have followed exactly the architecture you suggest, with
3 web services:
The first I call the "recorder" (your "capturer"); it just captures
the beacon hit.
The second I call the "processor" (your "store"); it updates the
structures and incurs most of the CPU cost (in your suggestion you
spread some of that CPU load to the reporter, and that could be a
promising alternative).
The third, the "reporter", which is really a Google Data Source
implementation, fetches the precalculated chart data from the data
store and drives Google Chart based reports.
I do the breakup to 1) improve batching, 2) make beacon hits fast,
3) make the reporting fast.
Doing that breakup as fully independent apps doesn't actually change
the cost per se (except if you take into account the free first 5M
hits).
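
To make the split concrete, here is a rough sketch of the three roles in plain Python. It is only an illustration and not my actual code: the names record, process and report, and the in-memory raw_hits and chart_data, are hypothetical stand-ins for the real services and the data store.

```python
from collections import Counter

raw_hits = []            # hypothetical stand-in for the recorder's raw hit log
chart_data = Counter()   # hypothetical stand-in for precalculated chart data


def record(page):
    """Recorder: capture the beacon hit and return immediately."""
    raw_hits.append(page)


def process():
    """Processor: fold all pending hits into the aggregates in one batch."""
    batch = list(raw_hits)
    del raw_hits[:]
    chart_data.update(batch)
    return len(batch)


def report(top=10):
    """Reporter: serve precalculated rows, no aggregation at request time."""
    return chart_data.most_common(top)
```

The point of the split is that record() does almost no work per hit, process() pays the CPU cost once per batch, and report() only reads data that is already aggregated.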

@geoffrey
> Relying on memcache as reliable storage even temporarily is almost
> certainly a bad idea.
Geoffrey, I am taking a gamble with this:
GAE currently lacks the concept of a file system "logfile": a very
efficient (think cheap), append-only, buffered file that sequences
chronologically all writes received from the web servers. That's what
they use internally to implement the Apache log facility.
The only way to simulate this is via memcache.
So I have implemented the equivalent of a "Buffered Append Only
LOGFILE", which accumulates writes in memory and flushes to disk every
100 lines or 1 minute (whichever happens first).
Remember that OS-based logfiles also do not guarantee persistence
until the "flush" has actually happened.
If it behaves as expected I will suggest it to the gaeutilities
guys. My hope is that a frequently accessed memcache item doesn't
disappear within 30-60 secs except in rare cases, and log files are OK
with that: they are not truly transactional storage.
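
A minimal sketch of the buffering idea, assuming a dict-like cache in place of memcache and a callback in place of the durable write. The class name BufferedAppendLog, the "buf" key and the flush_fn and clock parameters are all made up for illustration; this is not the GAE API. It also ignores concurrent writers, which a real version would have to handle.

```python
import time


class BufferedAppendLog:
    """Accumulate lines in a volatile cache and flush them to durable
    storage every max_lines lines or max_age seconds, whichever is first."""

    def __init__(self, cache, flush_fn, max_lines=100, max_age=60.0,
                 clock=time.time):
        self.cache = cache          # hypothetical stand-in for memcache
        self.flush_fn = flush_fn    # hypothetical stand-in for a durable write
        self.max_lines = max_lines
        self.max_age = max_age
        self.clock = clock

    def append(self, line):
        now = self.clock()
        buf = self.cache.get("buf")
        if buf is None:
            # First write, or the cache evicted the buffer (those lines are lost).
            buf = {"started": now, "lines": []}
        buf["lines"].append(line)
        too_many = len(buf["lines"]) >= self.max_lines
        too_old = now - buf["started"] >= self.max_age
        if too_many or too_old:
            self.flush_fn(buf["lines"])
            self.cache.pop("buf", None)
        else:
            self.cache["buf"] = buf
```

Note that in this sketch the age check only runs when the next line arrives, so a quiet period delays the flush until the next hit (or until something else, such as a scheduled request, triggers it).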

diomedes

On Feb 17, 9:25 am, Dave Warnock <[email protected]> wrote:
> Geoffrey,
>
> >> - a report application that grabs the data via api calls to a data
> >> store app, does processing and shoves it in memcache.
>
> > Relying on memcache as reliable storage even temporarily is almost
> > certainly a bad idea.
>
> Agreed. But a report server would not be doing so. If the data is not
> available via memcache it would grab it via api calls to the data
> store app. Ok slower then than direct bigtable access but just the
> same principle (except maybe you add a bit more processing between
> bigtable and memcache to get the data ready for the report).
>
> Dave--
> Dave Warnock:http://42.blogs.warnock.me.uk