Well, for one thing, you are using datastore.Query() instead of db.Query().

Almost all documentation on working with the datastore says to use db
from google.appengine.ext rather than datastore from google.appengine.api.

Maybe there is a difference in how they perform in this context?
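
One way to check would be to time both paths side by side. Below is a
minimal sketch of such a harness, not a definitive benchmark. The two
fetchers are made-up stand-ins so the script runs anywhere; on App Engine
you would swap in the real calls noted in the comments, where Person is
assumed to be a db.Model subclass you define yourself.

```python
import time


def time_fetch(fetch, runs=5):
    """Call fetch() `runs` times; return (average seconds, size of last result)."""
    total = 0.0
    count = 0
    for _ in range(runs):
        start = time.time()
        results = list(fetch())  # list() forces the whole result set to be built
        total += time.time() - start
        count = len(results)
    return total / runs, count


# Hypothetical stand-ins for the two query paths. On App Engine these would be
# replaced with something like:
#   lambda: datastore.Query("Person").Get(1000)   # low-level API, as in your test
#   lambda: db.Query(Person).fetch(1000)          # db API; Person is an assumed db.Model
def fake_low_level_fetch():
    return [{"firstname": "x" * 8, "lastname": "y" * 8} for _ in range(1000)]


def fake_model_fetch():
    return [dict(firstname="x" * 8, lastname="y" * 8) for _ in range(1000)]


if __name__ == "__main__":
    for name, fetch in [("datastore", fake_low_level_fetch),
                        ("db", fake_model_fetch)]:
        avg, n = time_fetch(fetch)
        print("%s: %d entities, avg %.1f ms" % (name, n, avg * 1000))
```

Averaging over several runs should smooth out some of the request-to-request
noise, though the first call may still be skewed by warm-up.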

Also, are you running these tests on App Engine itself or in the
dev_appserver? (I'm presuming you're running them on live App Engine, but
just to be sure.)
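
Either way, profiling might show where the time actually goes, for instance
whether entity decoding dominates as you suspect in your message. Here is a
rough sketch using the standard cProfile and pstats modules. The
decode_entities function is a hypothetical placeholder for your real
query-and-decode step, and profile_call is just a helper name I picked.

```python
import cProfile
import pstats

try:
    from StringIO import StringIO  # Python 2
except ImportError:
    from io import StringIO  # Python 3


def decode_entities(n=1000):
    # Hypothetical placeholder: replace the body with the real query call,
    # e.g. list(datastore.Query("Person").Get(n)) when running on App Engine.
    return [dict(firstname="a" * 8, lastname="b" * 8) for _ in range(n)]


def profile_call(func, top=10):
    """Run func() under cProfile; return (result, text report of the top entries)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func)
    stream = StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time so the most expensive call chains come first.
    stats.sort_stats("cumulative").print_stats(top)
    return result, stream.getvalue()


if __name__ == "__main__":
    entities, report = profile_call(decode_entities)
    print("fetched %d entities" % len(entities))
    print(report)
```

If the decoding functions sit at the top of the cumulative-time listing, that
would support the protocol buffer theory; if most of the time is outside
Python code, the difference is probably elsewhere.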

On Mon, Oct 11, 2010 at 9:53 AM, Darshan Shaligram <[email protected]> wrote:

> This is a followup query to my question on stackoverflow:
>
> http://stackoverflow.com/questions/3886341/is-appengine-python-datastore-query-much-3x-slower-than-java
>
> I've been evaluating the appengine to choose between Python and Java
> and I noticed a large performance difference in datastore queries:
> large queries are much slower in Python (by a factor of >3x) than in
> Java. I'd like to confirm that this performance difference is known
> behaviour, and not some mistake I'm making in my Python code.
>
> My test entity looks like this:
>
> Person
> ======
> firstname (length 8)
> lastname (length 8)
> address (20)
> city (10)
> state (2)
> zip (5)
>
> I populate the datastore with 2000 Person records, with each field
> exactly the length noted here, all filled with random data and with no
> fields indexed (just so the inserts go faster).
>
> I then query 1k Person records from Python (no filters, no ordering):
>
>    q = datastore.Query("Person")
>    objects = list(q.Get(1000))
>
> And 1k Person records from Java (likewise no filters, no ordering):
>
>    DatastoreService ds = DatastoreServiceFactory.getDatastoreService();
>    Query q = new Query("Person");
>    PreparedQuery pq = ds.prepare(q);
>    // Force the query to run and return objects so we can be sure
>    // we've timed a full query.
>    List<Entity> entityList =
>        new ArrayList<Entity>(pq.asList(withLimit(1000)));
>
> With this code, the Java code returns results in ~200ms; the Python
> code takes much longer, averaging >700ms. Both apps are on the same
> app id (with different versions), so they use the same datastore and
> should be on a level playing field.
>
> I repeated the same test with much smaller fetches (fetch size 10-30)
> and the small fetches show essentially the same performance for both
> Python and Java, so the Python slowness affects only large fetches.
>
>
> All my code is available here, in case I've missed any details:
> http://github.com/greensnark/appenginedatastoretest
>
>
> I also instrumented the sample apps with appstats (as suggested on
> stackoverflow), and reran the tests (1k record fetch).  Appstats
> reports times like this "datastore_v3.RunQuery real=122ms api=9179ms"
> for Java and times like "datastore_v3.RunQuery real=377ms api=9179ms"
> for Python. I'm not entirely clear on how to read the appstats times.
>
> From my examination of the Python code in
> google.appengine.api.datastore, it looks like most of the extra
> slowdown in the Python code involves decoding the queried entities
> from their protocol buffers, but I haven't benchmarked this to be
> sure.
>
> Could anyone confirm if large datastore queries are just slower in
> Python because Python is intrinsically slower than Java, or that my
> code is broken in some way that's screwing with the performance in the
> Python version?
>
> --
> You received this message because you are subscribed to the Google Groups
> "Google App Engine" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to
> [email protected].
> For more options, visit this group at
> http://groups.google.com/group/google-appengine?hl=en.
>
>
