great questions! i can tackle at least some of these. as a disclaimer, everything i say here refers to the production environment. the SDK behaves differently, and as discussed in a few other threads, is inefficient in a number of ways.
On Sep 11, 12:43 pm, Bill <[EMAIL PROTECTED]> wrote:
> 1) Will timeout issues on put/transactions be removed when we go
> pay-as-you-go or should we develop production apps with these limits in
> mind?

datastore timeouts and request deadlines will still exist after we've launched billing, so yes, you'll want to develop with them in mind.

> Exact # of puts or transactions you can reasonably expect to
> work within one request before quota issue.

exact numbers will always depend on the size and shape of your data. having said that, you should be able to put or delete a large number of entities, e.g. in the hundreds or more, if you pass multiple entities or keys in a single put() or delete() call:

http://code.google.com/appengine/docs/datastore/functions.html

you may also be able to write or delete more entities if the ratio of entities to entity groups in the put() or delete() call is high.

> 3) Best practices for (de)normalization and entity sizes. A gut
> reaction some developers might take when approaching datastore is to
> denormalize and put stuff in fewer tables. What are the costs of
> keeping many small entities and using reference properties instead?

storing and querying on reference properties doesn't cost any more than storing and querying on non-reference properties. the one reference property feature that incurs extra cost is the automatic dereferencing:

http://code.google.com/appengine/docs/datastore/typesandpropertyclasses.html#ReferenceProperty

> For example, in a many-to-many relationship, we could have 3 Kinds: A,
> B, and join(A,B). This is just like a traditional relational DB with
> a join model. What are the costs of traversing implicit collection
> sets defined by the reference properties in the join Kind? If you
> have a limited relationship between two entities, when does using a
> ListProperty (of keys, for example) make sense, especially in light of
> the cap on indexed properties per entity?
you almost always want to model one-to-many relationships with reference properties. similarly, you almost always want to model many-to-many relationships with a list reference property, i.e. ListProperty(db.Key). with these, "related to X" queries won't cost any more than any other query.

using a "join" kind, on the other hand, incurs additional fetches for each of the result entities on top of the join kind query. the main use case for join kinds is when you want to impose additional criteria on the join at runtime.

rafe kaplan's google i/o talk describes these techniques in detail:

http://sites.google.com/site/io/working-with-google-app-engine-models

> 4) Benchmarks! I've been meaning to run tests on costs for different
> datastore operations:
> - Direct get using key or id
> - Direct get using list of key/id
> - Fetches using filters
> - Iterative get from a query
> - How the above 3 (direct w/ key, bulk fetch, iterative get) scale
> with request size.
> - Delete/Put
> - The big hit using transactions

like always, these will depend noticeably on the size and shape of your data. i can give a few rules of thumb, though.

direct gets by key will usually be the fastest operation. single-property queries, i.e. queries with a single filter or sort order, should generally be fast. queries that use a user-defined index should generally be fast.

queries with equals filters on multiple properties that use the built-in indexes have an extra amount of overhead, which is roughly a fixed cost per query result. the overhead will depend on (you can probably guess what's next) the size and shape of your data. if these queries aren't as fast as you'd like in your app, adding dedicated index(es) will speed them up.

finally, transactions shouldn't add a prohibitive amount of overhead. in many cases, doing a number of writes in a transaction can actually be (a little) faster than doing them outside of a transaction.
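going back to the modeling question for a moment: the difference between a list of keys and a join kind is easier to see with a count of the extra fetches. the sketch below is a toy in-memory stand-in, not the datastore API. the ToyDatastore class, the Author/Book/Authorship kinds, and the tuple keys are all made up for illustration, and the fetch counter only shows the shape of the cost, not real latencies.

```python
class ToyDatastore(object):
    """Hypothetical in-memory stand-in for the datastore (illustration only)."""

    def __init__(self):
        self.entities = {}  # (kind, name) tuple -> dict of properties
        self.fetches = 0    # gets by key done on top of a query

    def put(self, key, **props):
        self.entities[key] = props

    def get(self, key):
        self.fetches += 1
        return self.entities[key]

    def query(self, kind, prop, value):
        """Equality filter; a list property matches if any element is equal."""
        results = []
        for key in sorted(self.entities):
            props = self.entities[key]
            if key[0] != kind:
                continue
            stored = props.get(prop)
            if stored == value or (isinstance(stored, list) and value in stored):
                results.append(props)
        return results


ds = ToyDatastore()
alice = ("Author", "alice")
ds.put(alice, name="alice")
for title in ("a", "b", "c"):
    book = ("Book", title)
    # option 1: the book itself carries a list of author keys
    ds.put(book, title=title, authors=[alice])
    # option 2: a separate join entity links author and book
    ds.put(("Authorship", "alice-" + title), author=alice, book=book)

# list of keys: the query results are already the books
books_via_list = [b["title"] for b in ds.query("Book", "authors", alice)]
list_fetches = ds.fetches

# join kind: the query returns join entities, each book needs its own get
ds.fetches = 0
books_via_join = [ds.get(j["book"])["title"]
                  for j in ds.query("Authorship", "author", alice)]
join_fetches = ds.fetches
```

both approaches return the same books here, but the join kind version does one get per result on top of the query, which is the extra cost described above.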
are you seeing a noticeable slowdown with transactions, compared to without?

> Is it a big win to come up with a good key naming scheme or does that
> bite you in other ways?

you mean, providing a key_name instead of having the datastore allocate an id?

http://code.google.com/appengine/docs/datastore/keysandentitygroups.html#Kinds_Names_and_IDs

performance should be the same with key_name vs. id. the main difference is that key_name allows a (limited) form of querying without actually querying. for example, say you're writing a wiki, and you put the page name in key_name. when a request for a page comes in, you can construct a key with that key_name in memory and get() it directly, as opposed to querying with an equals filter.
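here's a rough sketch of the wiki example. it's a toy in-memory stand-in so it runs anywhere, not the datastore API: ToyDatastore, make_key, query_one and the WikiPage properties are hypothetical names. in a real app the first approach would look something like WikiPage.get_by_key_name(name), or db.get(db.Key.from_path('WikiPage', name)), and the second something like WikiPage.all().filter('title =', name).get().

```python
class ToyDatastore(object):
    """Hypothetical in-memory stand-in for the datastore (illustration only)."""

    def __init__(self):
        self._by_key = {}  # (kind, key_name) tuple -> dict of properties

    def put(self, key, **props):
        self._by_key[key] = props

    def get(self, key):
        """Direct lookup by key; returns None if nothing is stored there."""
        return self._by_key.get(key)

    def query_one(self, kind, prop, value):
        """Equality filter on a property; returns the first match or None."""
        for key in sorted(self._by_key):
            props = self._by_key[key]
            if key[0] == kind and props.get(prop) == value:
                return props
        return None


def make_key(kind, key_name):
    """Build a key purely in memory, without touching the datastore."""
    return (kind, key_name)


ds = ToyDatastore()
# the page name doubles as the key_name
ds.put(make_key("WikiPage", "FrontPage"), title="FrontPage", body="welcome!")

# key_name approach: construct the key from the request and get() it directly
page_by_key = ds.get(make_key("WikiPage", "FrontPage"))

# query approach: look the page up with an equals filter instead
page_by_query = ds.query_one("WikiPage", "title", "FrontPage")

# a page that doesn't exist yet simply comes back as None
missing_page = ds.get(make_key("WikiPage", "NoSuchPage"))
```

the two lookups return the same page. the key_name version just skips the query, which is the "querying without actually querying" part.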
