Excellent points. Thanks for writing this up--I'll save it for reference when it comes time to tackle this issue.
Vince

On Sun, Jun 7, 2009 at 4:08 PM, Baz <[email protected]> wrote:
> Just to throw in my 2 cents, I think caching should be a central feature of
> the datastore implementation. Unlike with a traditional environment where
> costs are pre-paid during the initial capital outlay, every inefficiency
> in GAE costs real money, and depending on the level of inefficiency, can
> add up to orders of magnitude more money. Take for example reading records.
> If you hit the datastore every time you needed a record, it could literally
> cost you millions of times more in datastore fees if you had a lot of users.
> Similarly for writing. If you sent a write request every time some data
> changed you would be paying many times more than if you queued 100 requests
> and sent them in one big batch.
>
> It would be irresponsible not to pre-optimize your datastore interactions
> through a cache.
>
> There are serious performance implications as well. The datastore has an
> awesome characteristic that allows it to process a query in the same
> amount of time regardless of how many records the datastore contains - be it
> 10 or 10 million. Execution costs only increase as the number of *results*
> being returned increases. That kind of consistency on every request is
> really amazing, but it comes with a downside. The small queries from
> well-indexed relational DBs that you're used to returning instantly still
> take that fixed amount of time with Google. So for large recordsets the
> datastore outperforms, but for smaller ones it won't be as snappy as you're
> used to. A caching layer would of course allow you to read from the
> datastore once, then serve subsequent requests from the super-fast cache.
>
> Another important point is threading, or lack thereof. Google App Engine
> does not support creating additional threads during a single request. So if
> for a certain request you needed to run several long queries in parallel in
> their own threads - you wouldn't be able to.
> You would have to run them sequentially in GAE and most likely exceed your
> 3 second request limit. Having a super-fast cache to read from mitigates
> this limitation.
>
> All this to say that caching with the datastore is much more important than
> in a regular environment. So much so that I think it should be baked right
> into OpenBD's datastore solution and turned on by default. The administrator
> could even have settings to manage the cache like "max queue size for
> batch write" and "max minutes between writes". To disable it you would
> have to change the values of those settings from their defaults to zero.
>
> The caching layer could even be invisible to the application. The same
> functions could be used, like GoogleWrite() or GoogleRead(), but behind the
> scenes OpenBD would intelligently broker requests between the application,
> the cache and the datastore based on a few simple rules. The details of
> when to cache, how to cache, synchronizing what's dirty and what's not,
> and queuing and sending batch requests would all be hidden from the
> developer.
>
> In this case we are lucky that GAE and the datastore have the limitations
> that they do. Since there are no joins or complex query functions or
> groupings or variations in performance based on the nature of your data to
> worry about, it seems like it would be feasible to develop a caching
> algorithm that is optimal for any system. Similarly, there is only one
> viable caching technology to choose from, one developed precisely for this
> use case, and one with an API - memcache. It would be the obvious choice of
> cache technology regardless of how the cache is implemented. The logic and
> choices behind implementing a caching layer on GAE are relatively
> straightforward compared to a more open environment. This decreases the
> potential innovations and customization that would be possible if it were
> left up to the application to manage, and therefore reduces the value of
> implementing it there.
> Why re-invent the wheel for each app?
>
> Baz
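
To make the broker idea above a bit more concrete, here is a rough sketch of a read-through cache combined with a batched write queue. It is only an illustration under stated assumptions: it is written in Python rather than CFML, the class and method names (FakeDatastore, CachingBroker, read, write, flush) are made up for this example and are not OpenBD or GAE APIs, and plain dictionaries stand in for memcache and the datastore. The two settings mirror the "max queue size for batch write" and "max minutes between writes" idea (in seconds here), and setting either one to zero makes every write go straight through.

```python
import time


class FakeDatastore:
    """Stand-in for the real datastore; counts calls so the savings are visible."""

    def __init__(self):
        self.rows = {}
        self.reads = 0
        self.batch_writes = 0

    def get(self, key):
        self.reads += 1
        return self.rows.get(key)

    def put_batch(self, items):
        self.batch_writes += 1
        self.rows.update(items)


class CachingBroker:
    """Hypothetical broker sitting between the application and the datastore."""

    def __init__(self, store, max_queue_size=100,
                 max_seconds_between_writes=60.0, clock=time.monotonic):
        self.store = store
        self.cache = {}   # stand-in for memcache
        self.dirty = {}   # pending writes, keyed so repeated writes collapse
        self.max_queue_size = max_queue_size
        self.max_seconds = max_seconds_between_writes
        self.clock = clock
        self.last_flush = clock()

    def read(self, key):
        # Serve from the cache when possible; hit the datastore once otherwise.
        if key in self.cache:
            return self.cache[key]
        value = self.store.get(key)
        self.cache[key] = value
        return value

    def write(self, key, value):
        # Update the cache immediately and queue the datastore write.
        self.cache[key] = value
        self.dirty[key] = value
        queue_full = len(self.dirty) >= self.max_queue_size
        too_old = self.clock() - self.last_flush >= self.max_seconds
        if queue_full or too_old:
            self.flush()

    def flush(self):
        # Send all pending writes to the datastore in one batch.
        if self.dirty:
            self.store.put_batch(dict(self.dirty))
            self.dirty.clear()
        self.last_flush = self.clock()
```

One trade-off this sketch does not address: queued writes live only in memory until flush() runs, so a real implementation would need to decide what happens to them if the instance goes away or the cache evicts an entry first.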
