Friday, June 5, 2009
10 things you (probably) didn't know about App Engine<http://googleappengine.blogspot.com/2009/06/10-things-you-probably-didnt-know-about.html>

What could be better than nine nifty tips and tricks about App Engine? Why,
ten of course. As we've been participating in the discussion
groups<http://code.google.com/appengine/community.html>,
we've noticed that some features of App Engine often go unnoticed so we've
come up with just under eleven fun facts which might just change the way
that you develop your app. Without further ado, bring on the first tip:
1. App Versions are strings, not numbers

Although most of the examples show the 'version' field in app.yaml and
appengine-web.xml as a number, that's just a matter of convention. App
versions can be any string that's allowed in a URL. For example, you could
call your versions "live" and "dev", and they would be accessible at
"live.latest.*yourapp*.appspot.com" and "dev.latest.*yourapp*.appspot.com".
2. You can have multiple versions of your app running simultaneously

As we alluded to in point 1, App Engine permits you to deploy multiple
versions of your app and have them running side-by-side. All the versions
share the same datastore and memcache, but they run in separate instances and
have different URLs. Your 'live' version always serves off
yourapp.appspot.com as well as any domains you have mapped, but all your
app's versions are accessible at version.latest.yourapp.appspot.com.
Multiple versions are particularly useful for testing a new release in a
production environment, on real data, before making it available to all your
users.

Something that's less known is that the different app versions don't even
have to have the same runtime! It's perfectly fine to have one version of an
app using the Java runtime and another version of the same app using the
Python runtime.
3. The Java runtime supports any language that compiles to Java bytecode

It's called the Java runtime, but in fact there's nothing stopping you from
writing your App Engine app in any other language that compiles to JVM
bytecode. In fact, there are already people writing App Engine apps in
JRuby, Groovy, Scala, Rhino (a JavaScript interpreter), Quercus (a PHP
interpreter/compiler), and even Jython! Our community has shared notes on
what they've found to work and not work on the following wiki
page<http://groups.google.com/group/google-appengine-java/web/will-it-play-in-app-engine?pli=1>.
4. The 'IN' and '!=' operators generate multiple datastore queries 'under the hood'

The 'IN' and '!=' operators in the Python runtime are actually implemented
in the SDK and translate to multiple queries 'under the hood'.

For example, the query "SELECT * FROM People WHERE name IN ('Bob', 'Jane')"
gets translated into two queries, equivalent to running "SELECT * FROM
People WHERE name = 'Bob'" and "SELECT * FROM People WHERE name = 'Jane'"
and merging the results. Combining multiple disjunctions multiplies the
number of queries needed, so the query "SELECT * FROM People WHERE name IN
('Bob', 'Jane') AND age != 25" generates a total of four queries, one for
each combination of conditions (age less than or greater than 25, and name
'Bob' or 'Jane'), then merges them together into a single result set.
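
The expansion described above can be sketched in plain Python. This is only an illustrative model, not the SDK's actual implementation: the function name expand_filters and the tuple representation of a filter are assumptions made for this example.

```python
from itertools import product


def expand_filters(in_filters, not_equal_filters):
    """Expand IN and != filters into the simple queries that would be run.

    in_filters: dict mapping a property name to its list of allowed values.
    not_equal_filters: dict mapping a property name to its excluded value.
    (Hypothetical helper for illustration; not part of the App Engine SDK.)
    """
    alternatives = []
    # Each IN filter becomes one equality filter per listed value.
    for prop, values in sorted(in_filters.items()):
        alternatives.append([(prop, "=", value) for value in values])
    # Each != filter becomes a "less than" and a "greater than" filter.
    for prop, value in sorted(not_equal_filters.items()):
        alternatives.append([(prop, "<", value), (prop, ">", value)])
    # One simple query is needed per combination of alternatives.
    return [list(combo) for combo in product(*alternatives)]


# The example from the text: name IN ('Bob', 'Jane') AND age != 25.
subqueries = expand_filters({"name": ["Bob", "Jane"]}, {"age": 25})
for filters in subqueries:
    print(filters)
```

Under this model the counts multiply: an IN list of ten values combined with two '!=' filters would expand to 10 x 2 x 2 = 40 simple queries.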

The upshot of this is that you should avoid using excessively large
disjunctions. If you're using an inequality query, for example, and you
expect only a small number of records to exactly match the condition (e.g.
in the above example, you know very few people will have an age of exactly
25), it may be more efficient to execute the query without the inequality
filter and exclude any returned records that don't match it yourself.
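
As a rough sketch of that last suggestion, the snippet below filters out the unwanted records in application code. The list named fetched is a stand-in for whatever the query without the inequality filter returns, and exclude_age is a hypothetical helper, not an SDK function.

```python
# Stand-in for the entities returned by a query that has no age filter.
fetched = [
    {"name": "Bob", "age": 31},
    {"name": "Jane", "age": 25},
    {"name": "Bob", "age": 47},
]


def exclude_age(records, excluded_age):
    """Drop records whose age equals excluded_age, keeping the original order."""
    return [record for record in records if record["age"] != excluded_age]


matching = exclude_age(fetched, 25)
for record in matching:
    print(record)
```

This trades a little extra data transfer for fewer datastore queries, which only pays off when few records actually match the excluded value.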
5. You can batch put, get and delete operations for efficiency

Every time you make a datastore request, such as a query or a get()
operation, your app has to send the request off to the datastore, which
processes the request and sends back a response. This request-response cycle
takes time, and if you're doing a lot of operations one after the other,
this can add up to a substantial delay in how long your users have to wait
to see a result.

Fortunately, there's an easy way to reduce the number of round trips: batch
operations. The db.put(), db.get(), and db.delete() functions all accept
lists in addition to their more usual singular invocation. When passed a
list, they perform the operation on all the items in the list in a single
datastore round trip, executing them in parallel and saving you a lot of
time. For example, take a look at this common pattern:

for entity in MyModel.all().filter("color =",
    old_favorite).fetch(100):
  entity.color = new_favorite
  entity.put()

Doing the update this way requires one datastore round trip for the query,
plus one additional round trip for each updated entity - for a total of up
to 101 round trips! In comparison, take a look at this example:

updated = []
for entity in MyModel.all().filter("color =",
    old_favorite).fetch(100):
  entity.color = new_favorite
  updated.append(entity)
db.put(updated)

By adding two lines, we've reduced the number of round trips required from
101 to just 2!
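
The round-trip arithmetic can be checked with a small simulation. CountingDatastore below is a hypothetical in-memory stand-in written for this example, not the real datastore API; it only counts how many calls would have crossed the network.

```python
class CountingDatastore(object):
    """Hypothetical stand-in that counts datastore round trips."""

    def __init__(self, entities):
        self.entities = list(entities)
        self.round_trips = 0

    def fetch(self, limit):
        self.round_trips += 1  # the query itself is one round trip
        return self.entities[:limit]

    def put(self, entity_or_list):
        # One round trip per call, whether given one entity or a list.
        self.round_trips += 1


def update_one_by_one(store, new_color):
    """Mirror the first pattern: one put() per entity."""
    for entity in store.fetch(100):
        entity["color"] = new_color
        store.put(entity)
    return store.round_trips


def update_in_batch(store, new_color):
    """Mirror the second pattern: collect entities, then a single put()."""
    updated = []
    for entity in store.fetch(100):
        entity["color"] = new_color
        updated.append(entity)
    store.put(updated)
    return store.round_trips


def make_entities():
    return [{"color": "red"} for _ in range(100)]


slow = update_one_by_one(CountingDatastore(make_entities()), "blue")
fast = update_in_batch(CountingDatastore(make_entities()), "blue")
print("one by one: %d round trips, batched: %d round trips" % (slow, fast))
```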
6. Datastore performance doesn't depend on how many entities you have

Many people ask about how the datastore will perform once they've inserted
100,000, or a million, or ten million entities. One of the datastore's major
strengths is that its performance is totally independent of the number of
entities your app has. So much so, in fact, that every entity for every App
Engine app is stored in a single BigTable table! Further, when it comes to
queries, all the queries that you can execute natively (with the notable
exception of those involving 'IN' and '!=' operators - see above) have
equivalent execution cost: The cost of running a query is proportional to
the number of results returned by that query.
7. The time it takes to build an index isn't entirely dependent on its size

When adding a new index to your app on App Engine, it sometimes takes a
significant amount of time to build. People often inquire about this, citing
the amount of data they have compared to the time taken. However, requests
to build new indexes are actually added to a queue of indexes that need to
be built, and processed by a centralized system that builds indexes for all
App Engine apps. At peak times, there may be other index building jobs ahead
of yours in the queue, delaying when we can start building your index.
8. The value for 'Stored Data' is updated once a day

Once a day, we run a task to recalculate the 'Stored Data' figure for your
app based on your actual datastore usage at that time. In the intervening
period, we update the figure with an estimate of your usage so we can give
you immediate feedback on changes in your usage. This explains why many
people have observed that after deleting a large number of entities,
their datastore usage remains at previous levels for a while. For billing
purposes, only the authoritative number is used, naturally.
9. The order in which handlers are specified in app.yaml, web.xml, and appengine-web.xml matters

One of the more common and subtle mistakes people make when configuring
their app is to forget that handlers in the application configuration files
are processed in order, from top to bottom. For example, when installing
remote_api, many people do the following:

handlers:
- url: /.*
  script: request.py

- url: /remote_api
  script: $PYTHON_LIB/google/appengine/ext/remote_api/handler.py
  login: admin

The above looks fine at first glance, but because handlers are processed in
order, the handler for request.py is encountered first, and all requests -
even those for remote_api - get handled by request.py. Since request.py
doesn't know about remote_api, it returns a 404 Not Found error. The
solution is simple: Make sure that the catchall handler comes after all
other handlers.
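
The first-match behaviour can be modelled in a few lines of Python. This is a simplified sketch, assuming each url pattern is treated as a regular expression that must match the whole request path; first_matching_script is a hypothetical helper, not how App Engine itself is implemented.

```python
import re


def first_matching_script(handlers, path):
    """Return the script of the first handler whose pattern matches path."""
    for pattern, script in handlers:
        # Assumption: the pattern must match the entire path.
        if re.match("(?:%s)$" % pattern, path):
            return script
    return None


REMOTE_API = "$PYTHON_LIB/google/appengine/ext/remote_api/handler.py"

# Same two handlers, in the broken order and in the corrected order.
catchall_first = [("/.*", "request.py"), ("/remote_api", REMOTE_API)]
catchall_last = [("/remote_api", REMOTE_API), ("/.*", "request.py")]

wrong = first_matching_script(catchall_first, "/remote_api")
right = first_matching_script(catchall_last, "/remote_api")
print(wrong)
print(right)
```

With the catchall listed first, the remote_api handler is never reached; with it listed last, every other path still falls through to request.py.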

The same is true for the Java runtime, with the additional constraint that
all the static file handlers in appengine-web.xml are processed before any
of the dynamic handlers in web.xml.
10. You don't need to construct GQL strings by hand

One anti-pattern that comes up a lot looks similar to this:

q = db.GqlQuery("SELECT * FROM People "
    "WHERE first_name = '" + first_name
    + "' AND last_name = '" + last_name + "'")

As well as opening up your code to injection vulnerabilities, this
practice introduces escaping issues (what if a user has an apostrophe in
their name?) and, potentially, encoding issues. Fortunately, GqlQuery has
built-in support for parameter substitution, a common technique that avoids
the need to splice values into the query string in the first place. Using
parameter substitution, the above query can be rephrased like this:

q = db.GqlQuery("SELECT * FROM People "
    "WHERE first_name = :1 "
    "AND last_name = :2", first_name, last_name)

GqlQuery also supports named instead of numbered parameters, with the values
passed as keyword arguments:

q = db.GqlQuery("SELECT * FROM People "
    "WHERE first_name = :first_name "
    "AND last_name = :last_name",
    first_name=first_name, last_name=last_name)

Aside from cleaning up your code, this also allows for some neat
optimizations. If you're going to execute the same query multiple times with
different values, you can use GqlQuery.bind() to 'rebind' the values of the
parameters for each query. This is faster than constructing a new query each
time, because the query only has to be parsed once:

q = db.GqlQuery("SELECT * FROM People "
    "WHERE first_name = :first_name "
    "AND last_name = :last_name")
for first, last in people:
  q.bind(first_name=first, last_name=last)
  person = q.get()
  print person
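
To make the escaping problem with hand-built query strings concrete, the sketch below reproduces the concatenation from the anti-pattern without touching the datastore. The function build_by_concatenation is a hypothetical name used only for this illustration.

```python
def build_by_concatenation(first_name, last_name):
    """Build the query text the way the anti-pattern does (not recommended)."""
    return ("SELECT * FROM People "
            "WHERE first_name = '" + first_name
            + "' AND last_name = '" + last_name + "'")


plain = build_by_concatenation("Bob", "Smith")
broken = build_by_concatenation("Bob", "O'Brien")

print(plain)
print(broken)

# An odd number of quote characters means a string literal is left unbalanced.
plain_is_balanced = plain.count("'") % 2 == 0
broken_is_balanced = broken.count("'") % 2 == 0
```

The apostrophe in the second name closes the string literal early, so the query text no longer means what was intended. With parameter substitution the value never becomes part of the query text, so the issue does not arise.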

Posted by Nick Johnson, App Engine Team

*Java is a trademark or registered trademark of Sun Microsystems, Inc. in
the United States and other countries.*
