Hi all, i've been slow on this, partly because i'm in the middle of some giant refactorings to optimize for the new pricing model on GAE. here are some thoughts:
- memcache is your friend. if you are querying for something on a request, toss it in memcache (you have to cast rows to dicts first). i have some lookup-type tables that i keep in memcache.
- make your schema as flat as possible. joins don't exist, so anything you would have joined becomes additional queries. also remember that things like row.ref.name run a query behind the scenes in web2py.
- if you have more than a few tables, put table definitions in module files (perhaps as classes), then import and init them when needed. in one of my projects with about 50 tables this is going to save me about 30ms per request (and at 2.5 million requests per day that turns into instance hours....)
- be wary of validators. know that IS_IN_DB() and IS_NOT_IN_DB() will perform queries. IS_IN_DB() will cost you 1 read per row in the table that goes into the drop-down in your form. consider using IS_IN_SET() with a set built from a memcached copy of the data.
- i hear that python 2.7 is supposed to make a difference. i have not pulled the trigger on any of my production apps yet. i'm afraid to turn on multithreading because of this issue: http://code.google.com/p/googleappengine/issues/detail?id=6323 but i don't know whether that will affect my app or not.
- i'm planning to start using the cache-control headers for staticish pages (i have lots of pages that can be cached for at least 5 minutes). it is reported that google will then serve your page from its cache for free!
- i've started to run task queue tasks on backend instances. this allows me to throttle them better and catch when i have written buggy code that goes out of control (yes, there was a $1000 bill for a day when that happened.)
- i have been running cron jobs to aggregate data for reporting purposes, since complex queries cannot be run on the fly. i'm starting to experiment with google cloud sql and putting reporting data in a SQL datastore so that site admins can run what they want when they want.

those are my major thoughts for now.
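a rough sketch of the memcache and IS_IN_SET() points above, not a drop-in implementation. the names cached_lookup, dropdown_options, FakeCache and the 'countries' key are made up for illustration. FakeCache only mimics the get/set calls of google.appengine.api.memcache so the example runs anywhere; on GAE you would pass the real memcache module instead.

```python
class FakeCache(object):
    """stand-in (assumption) mimicking memcache.get / memcache.set."""

    def __init__(self):
        self.store = {}

    def get(self, key):
        return self.store.get(key)

    def set(self, key, value, time=0):
        # the real memcache expires the entry after `time` seconds;
        # this stand-in ignores expiry
        self.store[key] = value
        return True


def cached_lookup(cache, key, fetch_rows, ttl=300):
    """return a list of plain dicts, querying only on a cache miss."""
    rows = cache.get(key)
    if rows is None:
        # cast rows to plain dicts before they go into the cache
        rows = [dict(r) for r in fetch_rows()]
        cache.set(key, rows, time=ttl)
    return rows


def dropdown_options(rows, value_field='id', label_field='name'):
    """build (value, label) pairs, a shape IS_IN_SET() accepts."""
    return [(r[value_field], r[label_field]) for r in rows]


def fetch_countries():
    # hypothetical query; in web2py this might be something like
    # db(db.country.id > 0).select().as_list()
    return [{'id': 1, 'name': 'Canada'}, {'id': 2, 'name': 'Mexico'}]


cache = FakeCache()
options = dropdown_options(cached_lookup(cache, 'countries', fetch_countries))
```

in a model file the result could then feed something like db.person.country.requires = IS_IN_SET(options), where person.country is again a hypothetical field. the cost moves from one read per row on every form render to one query per cache miss.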
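the "define tables only when needed" point could look roughly like this. LazyTables, FakeDB and the two define_* functions are hypothetical names for the sketch; FakeDB stands in for a web2py DAL object, and real definitions would pass Field(...) objects to db.define_table() rather than bare strings.

```python
class LazyTables(object):
    """define tables on demand instead of all of them on every request."""

    def __init__(self, db):
        self.db = db
        self._definers = {}
        self._defined = set()

    def register(self, name, definer):
        # definer is a function that takes db and calls db.define_table(...)
        self._definers[name] = definer

    def require(self, *names):
        # run each definer at most once for this db object
        for name in names:
            if name not in self._defined:
                self._definers[name](self.db)
                self._defined.add(name)
        return self.db


class FakeDB(object):
    """stand-in (assumption) for a DAL object so the sketch runs anywhere."""

    def __init__(self):
        self.tables = []

    def define_table(self, name, *fields):
        self.tables.append(name)


def define_person(db):
    db.define_table('person', 'name', 'email')


def define_order(db):
    db.define_table('order', 'person_id', 'total')


tables = LazyTables(FakeDB())
tables.register('person', define_person)
tables.register('order', define_order)

# a controller that only touches person pays only for that definition
db = tables.require('person')
```

the define_* functions would live in module files, and each controller would call require() with just the tables it uses, so a request that touches 2 of 50 tables skips the other 48 definitions.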
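for the cache-control idea, a minimal sketch of the headers involved. public_cache_headers is a made-up helper name, and the 300 seconds just mirrors the "at least 5 minutes" mentioned above.

```python
import time
from email.utils import formatdate


def public_cache_headers(max_age=300, now=None):
    """headers asking browsers and shared caches to keep a page for max_age seconds."""
    if now is None:
        now = time.time()
    return {
        'Cache-Control': 'public, max-age=%d' % max_age,
        # Expires is the older HTTP/1.0 way of saying the same thing
        'Expires': formatdate(now + max_age, usegmt=True),
    }


headers = public_cache_headers(300)
```

in a web2py controller this could be applied with response.headers.update(public_cache_headers(300)). whether google's front end actually serves the page from its cache is up to google, and a cookie set on the response may get in the way, so treat it as something to test rather than a guarantee.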
please let me know if you have questions about these techniques.

cfh

