From glancing at the two (Get() from datastore.Query and fetch() from db.Query):
http://code.google.com/p/googleappengine/source/browse/trunk/python/google/appengine/api/datastore.py#1199
http://code.google.com/p/googleappengine/source/browse/trunk/python/google/appengine/ext/db/__init__.py#1721

It seems like query.fetch(limit) from db just wraps query.Get(limit) from datastore, but it never hurts to do it the kosher way, in case something else is happening under the hood on live App Engine.

Also, maybe your Stopwatch() does something to slow it down? Try using appstats on it without the Stopwatch(). (Though I can't imagine how that would result in it running several hundred milliseconds slower.)

On Mon, Oct 11, 2010 at 5:18 PM, Eli Jones <[email protected]> wrote:

> Well.. for one.. you are doing a datastore.query() instead of a db.query()
>
> Most all documentation on working with the datastore indicates to use db
> from google.appengine.ext instead of datastore from google.appengine.api.
>
> Maybe there is a difference in how they perform in this context?
>
> Also, are you doing these tests on Appengine or in the Dev_appserver? (I'm
> presuming you're doing them on appengine live.. but just to be sure).
>
> On Mon, Oct 11, 2010 at 9:53 AM, Darshan Shaligram <[email protected]> wrote:
>
>> This is a followup query to my question on stackoverflow:
>>
>> http://stackoverflow.com/questions/3886341/is-appengine-python-datastore-query-much-3x-slower-than-java
>>
>> I've been evaluating the appengine to choose between Python and Java
>> and I noticed a large performance difference in datastore queries:
>> large queries are much slower in Python (by a factor of >3x) than in
>> Java. I'd like to confirm that this performance difference is known
>> behaviour, and not some mistake I'm making in my Python code.
>>
>> My test entity looks like this:
>>
>> Person
>> ======
>> firstname (length 8)
>> lastname (length 8)
>> address (20)
>> city (10)
>> state (2)
>> zip (5)
>>
>> I populate the datastore with 2000 Person records, with each field
>> exactly the length noted here, all filled with random data and with no
>> fields indexed (just so the inserts go faster).
>>
>> I then query 1k Person records from Python (no filters, no ordering):
>>
>>     q = datastore.Query("Person")
>>     objects = list(q.Get(1000))
>>
>> And 1k Person records from Java (likewise no filters, no ordering):
>>
>>     DatastoreService ds =
>>         DatastoreServiceFactory.getDatastoreService();
>>     Query q = new Query("Person");
>>     PreparedQuery pq = ds.prepare(q);
>>     // Force the query to run and return objects so we can be sure
>>     // we've timed a full query.
>>     List<Entity> entityList = new
>>         ArrayList<Entity>(pq.asList(withLimit(1000)));
>>
>> With this code, the Java code returns results in ~200ms; the Python
>> code takes much longer, averaging >700ms. Both apps are on the same
>> app id (with different versions), so they use the same datastore and
>> should be on a level playing field.
>>
>> I repeated the same test with much smaller fetches (fetch size 10-30)
>> and the small fetches show essentially the same performance for both
>> Python and Java, so the Python slowness affects only large fetches.
>>
>>
>> All my code is available here, in case I've missed any details:
>> http://github.com/greensnark/appenginedatastoretest
>>
>>
>> I also instrumented the sample apps with appstats (as suggested on
>> stackoverflow), and reran the tests (1k record fetch). Appstats
>> reports times like this "datastore_v3.RunQuery real=122ms api=9179ms"
>> for Java and times like "datastore_v3.RunQuery real=377ms api=9179ms"
>> for Python. I'm not entirely clear on how to read the appstats times.
>>
>> From my examination of the Python code in
>> google.appengine.api.datastore, it looks like most of the extra
>> slowdown in the Python code involves decoding the queried entities
>> from their protocol buffers, but I haven't benchmarked this to be
>> sure.
>>
>> Could anyone confirm if large datastore queries are just slower in
>> Python because Python is intrinsically slower than Java, or that my
>> code is broken in some way that's screwing with the performance in the
>> Python version?
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "Google App Engine" group.
>> To post to this group, send email to [email protected].
>> To unsubscribe from this group, send email to [email protected].
>> For more options, visit this group at
>> http://groups.google.com/group/google-appengine?hl=en.
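
For anyone who wants to rerun the comparison without the Stopwatch() helper, here is a minimal timing sketch in plain Python. It is only an illustration, not code from the linked repository: time_call, median and fake_fetch are hypothetical names, and fake_fetch is a stand-in workload so the sketch runs outside App Engine.

```python
import time


def time_call(fn, repeats=5):
    """Call fn() `repeats` times; return (last result, sorted wall-clock times in ms)."""
    timings_ms = []
    result = None
    for _ in range(repeats):
        start = time.time()
        result = fn()
        timings_ms.append((time.time() - start) * 1000.0)
    return result, sorted(timings_ms)


def median(sorted_values):
    """Median of an already sorted, non-empty list of numbers."""
    mid = len(sorted_values) // 2
    if len(sorted_values) % 2:
        return sorted_values[mid]
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2.0


def fake_fetch():
    # Hypothetical stand-in for the real query: builds 1000 small records.
    return [{"firstname": "x" * 8, "zip": "0" * 5} for _ in range(1000)]


result, timings = time_call(fake_fetch)
print("median %.2f ms over %d runs" % (median(timings), len(timings)))
```

On App Engine, fake_fetch could presumably be swapped for something like `lambda: list(datastore.Query("Person").Get(1000))` from the original post. Taking the median over several runs, rather than a single measurement, should make a one-off slow request less likely to skew the comparison.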
