Excellent points. Thanks for writing this up--I'll save it for reference
when it comes time to tackle this issue.

Vince
On Sun, Jun 7, 2009 at 4:08 PM, Baz <[email protected]> wrote:

> Just to throw in my 2 cents, I think caching should be a central feature to
> the datastore implementation. Unlike with a traditional environment where
> costs are pre-paid during the initial capital outlay, every inefficiency
> in GAE costs real money,
> and depending on the level of inefficiency, can add up to orders of
> magnitude more money. Take for example reading records. If you hit the
> datastore every time you needed a record, it could literally cost you
> millions of times more in datastore fees if you had a lot of users.
> Similarly for writing. If you sent a write request every time some data
> changed you would be paying many times more than if you queued 100 requests
> and sent them in one big batch.
>
> It would be irresponsible not to pre-optimize your datastore interactions
> through a cache.
>
> There are serious performance implications as well. The datastore has an
> awesome characteristic that allows it to process a query in the same
> amount of time regardless of how many records the datastore contains - be it
> 10 or 10 million. Execution costs only increase as the number of *results*
> being returned increases. That kind of consistency on every request is
> really amazing, but it comes with a downside. The small queries from
> well-indexed relational db's that you're used to returning instantly still
> take that fixed amount of time with Google. So for large recordsets the
> datastore outperforms, but for smaller ones it won't be as snappy as you're
> used to. A
> caching layer would of course allow you to read from the datastore once,
> then serve subsequent requests from the super-fast cache.
>
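
[A minimal sketch of the read-through idea in the paragraph above, in Python
for brevity. A plain dict stands in for memcache, and ReadThroughCache and
fetch_from_datastore are hypothetical names, not OpenBD or App Engine APIs.]

```python
class ReadThroughCache:
    """Serve reads from a cache, falling back to the datastore on a miss."""

    def __init__(self, fetch_from_datastore):
        self._fetch = fetch_from_datastore  # hypothetical callable: key -> record
        self._cache = {}                    # stand-in for memcache
        self.datastore_reads = 0            # counts billable datastore hits

    def read(self, key):
        if key in self._cache:
            return self._cache[key]         # cache hit: no datastore call
        self.datastore_reads += 1
        record = self._fetch(key)           # cache miss: one datastore call
        self._cache[key] = record
        return record


store = {"user:1": {"name": "Baz"}}         # stand-in for the datastore
cache = ReadThroughCache(store.get)
for _ in range(1000):
    cache.read("user:1")
print(cache.datastore_reads)                # 1000 reads, 1 datastore hit
```
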
> Another important point is threading, or lack thereof. Google App Engine
> does not support creating additional threads during a single request. So if
> for a certain request you needed to run several long queries in parallel in
> their own threads - you wouldn't be able to. You would have to run them
> sequentially in GAE and most likely exceed your 3 second request limit.
> Having a super-fast cache to read from mitigates this limitation.
>
> All this to say that caching with the datastore is much more important than
> in a regular environment. So much so that I think it should be baked right
> into OpenBD's datastore solution and turned on by default. The administrator
> could even have settings to manage the cache like "max queue size for
> batch write" and "max minutes between writes". To disable it you would
> have to change the values of those settings from their defaults to zero.
>
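
[A rough sketch of the two settings suggested above, assuming a hypothetical
flush_batch callable that sends one batch to the datastore. The class and
parameter names are made up for illustration. Setting either limit to zero
makes every write flush immediately, which matches the "zero disables it"
idea.]

```python
import time


class WriteBehindQueue:
    """Queue writes and send them to the datastore in batches."""

    def __init__(self, flush_batch, max_queue_size=100,
                 max_minutes_between_writes=5, clock=time.monotonic):
        self._flush_batch = flush_batch      # hypothetical: dict of records -> None
        self._max_size = max_queue_size
        self._max_age = max_minutes_between_writes * 60  # seconds
        self._clock = clock
        self._pending = {}                   # key -> latest record
        self._last_flush = clock()

    def write(self, key, record):
        self._pending[key] = record          # later writes overwrite earlier ones
        too_many = len(self._pending) >= self._max_size
        too_old = self._clock() - self._last_flush >= self._max_age
        if too_many or too_old:
            self.flush()

    def flush(self):
        if self._pending:
            self._flush_batch(dict(self._pending))  # one batched request
            self._pending.clear()
        self._last_flush = self._clock()


batches = []                                 # records each batch that was sent
queue = WriteBehindQueue(batches.append, max_queue_size=100)
for i in range(250):
    queue.write(f"rec:{i}", {"value": i})
queue.flush()                                # send whatever is left
print([len(b) for b in batches])             # [100, 100, 50]
```

[Note that this sketch only checks the age limit when a write arrives, so an
idle queue would sit unflushed until something else triggers flush().]
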
> The caching layer could even be invisible to the application. The same
> functions could be used like GoogleWrite() or GoogleRead(), but behind the
> scenes OpenBD would intelligently broker requests between the application,
> the cache and the datastore based on a few simple rules. The details of
> when to cache, how to cache, synchronizing what's dirty and what's not,
> queuing and sending batch requests, would all be hidden from the developer.
>
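
[One way the invisible broker could look, sketched under heavy assumptions:
dicts stand in for both memcache and the datastore, and DatastoreBroker is a
hypothetical name. The application only calls read() and write(); dirty
tracking and batching stay inside the broker.]

```python
class DatastoreBroker:
    """Broker reads and writes between the app, a cache and the datastore."""

    def __init__(self, datastore, max_queue_size=100):
        self._datastore = datastore   # stand-in: a dict
        self._cache = {}              # stand-in for memcache
        self._dirty = set()           # keys changed since the last flush
        self._max = max_queue_size

    def write(self, key, record):
        self._cache[key] = record     # readers see the new value right away
        self._dirty.add(key)
        if len(self._dirty) >= self._max:
            self.flush()

    def read(self, key):
        if key not in self._cache:    # miss: load once from the datastore
            self._cache[key] = self._datastore.get(key)
        return self._cache[key]

    def flush(self):
        # Send all dirty records in one batch, then mark them clean.
        self._datastore.update({k: self._cache[k] for k in self._dirty})
        self._dirty.clear()


datastore = {}
broker = DatastoreBroker(datastore, max_queue_size=3)
broker.write("a", 1)
print(broker.read("a"), datastore)    # 1 {}  (served from cache, not yet flushed)
broker.write("b", 2)
broker.write("c", 3)                  # third dirty key triggers the batch write
print(datastore)                      # {'a': 1, 'b': 2, 'c': 3}
```

[A real memcache can evict entries at any time, so unflushed records would
need a safer home than the cache itself. The sketch ignores that.]
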
> In this case we are lucky that GAE and the datastore have the limitations
> that they do. Since there are no joins or complex query functions or
> groupings or variations in performance based on the nature of your data to
> worry about, it seems like it would be feasible to develop a caching algorithm
> that is optimal for any system. Similarly, there is only one viable caching
> technology to choose from, one developed precisely for this use case, and
> one with an API - memcache. It would be the obvious choice of cache
> technology regardless of how the cache is implemented. The logic and choices
> behind implementing a caching layer on GAE are relatively straightforward
> compared to a more open environment. This decreases the potential
> innovations and customization that would be possible if it were left up to
> the application to manage, and therefore reduces the value of implementing
> it there. Why re-invent the wheel for each app?
>
> Baz

--~--~---------~--~----~------------~-------~--~----~
Open BlueDragon Public Mailing List
 http://groups.google.com/group/openbd?hl=en
 official site @ http://www.openbluedragon.org/

!! save a network - trim replies before posting !!
-~----------~----~----~----~------~----~------~--~---
