[google-appengine] Re: CPU time for trivial datastore roundtrip

Anthony Mills Mon, 13 Jul 2009 06:27:59 -0700
50.

On Jul 13, 3:30 am, "Nick Johnson (Google)" <[email protected]>
wrote:
> On Sun, Jul 12, 2009 at 1:39 AM, Anthony Mills<[email protected]> wrote:
>
> > Nick, thanks for responding. I'll add a lot more detail and test data
> > here.
>
> > I'm doing three fetches: one to get the user data, one to get his
> > starred items, one to get his own items. I've broken it down to
> > examine each one individually. The user fetch is fast:
>
> > userEntity = datastore.get(KeyFactory.createKey("User", makeUserKeyName
> > (getCurrentUserId())));
>
> > /service?a=g&t=u 200 56ms 27cpu_ms 12api_cpu_ms 0kb
> > /service?a=g&t=u 200 35ms 21cpu_ms 12api_cpu_ms 0kb
> > /service?a=g&t=u 200 26ms 15cpu_ms 12api_cpu_ms 0kb
> > /service?a=g&t=u 200 49ms 15cpu_ms 12api_cpu_ms 0kb
> > /service?a=g&t=u 200 37ms 15cpu_ms 12api_cpu_ms 0kb
> > /service?a=g&t=u 200 27ms 15cpu_ms 12api_cpu_ms 0kb
>
> > The saved query is not so fast. Each item has a list property "s"
> > which is a list of all user IDs that have saved that item. It's served
> > by a dedicated index, s asc + i asc.
>
> > Query query = new Query("Item");
> > query.addFilter("s", Query.FilterOperator.EQUAL, getCurrentUserId());
> > query.addSort("i");
> > List<Entity> items = datastore.prepare(query).asList(withLimit
> > (REQUEST_SIZE));
>
> How big is REQUEST_SIZE? That's going to be an important factor in how
> expensive the query is.
>
>
>
>
>
> > /service?a=g&t=ts 200 97ms 710cpu_ms 678api_cpu_ms 0kb
> > /service?a=g&t=ts 200 79ms 711cpu_ms 678api_cpu_ms 0kb
> > /service?a=g&t=ts 200 84ms 705cpu_ms 678api_cpu_ms 0kb
> > /service?a=g&t=ts 200 87ms 716cpu_ms 678api_cpu_ms 0kb
> > /service?a=g&t=ts 200 84ms 696cpu_ms 678api_cpu_ms 0kb
> > /service?a=g&t=ts 200 72ms 697cpu_ms 678api_cpu_ms 0kb
>
> > The query to get the user's items is similarly slow. It is a query on
> > a string property "u", ordered by "i". It's served by a dedicated
> > index, u asc + i asc.
>
> > Query query = new Query("Item");
> > query.addFilter("u", Query.FilterOperator.EQUAL, getCurrentUserId());
> > query.addSort("i");
> > List<Entity> items = datastore.prepare(query).asList(withLimit
> > (REQUEST_SIZE));
>
> > /service?a=g&t=tm 200 71ms 699cpu_ms 678api_cpu_ms 0kb
> > /service?a=g&t=tm 200 90ms 710cpu_ms 678api_cpu_ms 0kb
> > /service?a=g&t=tm 200 73ms 699cpu_ms 678api_cpu_ms 0kb
> > /service?a=g&t=tm 200 67ms 692cpu_ms 678api_cpu_ms 0kb
> > /service?a=g&t=tm 200 83ms 698cpu_ms 678api_cpu_ms 0kb
> > /service?a=g&t=tm 200 80ms 722cpu_ms 678api_cpu_ms 0kb
>
> > Adding the sort does not seem to add any extra time, and nor should
> > it, given the index type.
>
> > Anyway, put the three together and this is what you get:
>
> > /service?a=g 200 929ms 1731cpu_ms 1368api_cpu_ms 24kb
> > /service?a=g 200 159ms 1423cpu_ms 1368api_cpu_ms 24kb
> > /service?a=g 200 171ms 1420cpu_ms 1368api_cpu_ms 24kb
> > /service?a=g 200 178ms 1414cpu_ms 1368api_cpu_ms 24kb
> > /service?a=g 200 146ms 1405cpu_ms 1368api_cpu_ms 24kb
>
> > See, what really annoys me is this. Those requests are coming back in
> > 170 ms. But I'm charged for 1400 ms of CPU time. How does that work?
> > It should be doing blocking work on the server, right? Are there
> > really nine servers all churning for that whole time?
>
> > Thanks for your time,
>
> > Anthony Mills
>
> > On Jul 10, 11:07 am, Alex Popescu
> > <[email protected]> wrote:
> >> Here's some sample code (not exactly the one I've previously mentioned
> >> as I haven't got the time to prepare it):
>
> >> [code]
> >> query = ContentEvent.gql('ORDER BY created_at DESC')
> >> event_list = query.fetch(max_results, offset)
> >> # now that we've fetched the initial list we need to navigate all the
> >> references
> >> # to make sure that there will be no further DB hits
> >> keys = []
> >> for event in event_list:
> >>   keys.append(event.get_refkey('entry'))
> >> entries = db.get(keys)
> >> for index, event in enumerate(event_list):
> >>   event.entry = entries[index]
> >> [/code]
>
> >> This code gets called with a max_results=20 and various offsets
> >> (pagination).
> >> Now the results:
>
> >> 07-10 09:51AM 01.720 /_m/cs?p=6 200 173ms 690cpu_ms 522api_cpu_ms 4kb
> >> 07-10 09:50AM 57.408 /_m/cs?p=5 200 203ms 690cpu_ms 522api_cpu_ms 4kb
> >> 07-10 09:50AM 52.963 /_m/cs?p=4 200 261ms 717cpu_ms 522api_cpu_ms 5kb
> >> 07-10 09:50AM 48.888 /_m/cs?p=3 200 164ms 745cpu_ms 522api_cpu_ms 5kb
> >> 07-10 09:50AM 42.068 /_m/cs?p=2 200 1563ms 2034cpu_ms 522api_cpu_ms
> >> 6kb
> >> 07-10 09:46AM 40.239 /_m/cs?p=1 200 168ms 745cpu_ms 522api_cpu_ms 5kb
> >> 07-10 09:46AM 30.222 /_m/cs?p=2 200 1422ms 2089cpu_ms 522api_cpu_ms
> >> 6kb
> >> 07-10 09:45AM 20.582 /_m/cs 200 2718ms 2145cpu_ms 522api_cpu_ms 5kb
>
> >> As you can see the initial requests are above 2000cpu_ms. Then it
> >> looks like there is some pre-fetching kicking in that pushes this time
> >> down to 690-750 cpu_ms.
>
> >> To add a bit more detail to the above code: there are 2 entities one
> >> of them keeping a ReferenceProperty to the other. Still, I am making
> >> sure there are only 2 hits to the datastore by fetching all child
> >> entities in a single get (basically there are 2 hits to the datastore
> >> each returning 20 results).
>
> >> The cpu_ms range is quite huge though: 690-750cpu_ms to 2000+cpu_ms.
>
> >> I hope to get some sample code for the initial scenario and report
> >> back.
>
> >> ./alex
> >> --
> >> .w( the_mindstorm )p.
> >>   Alexandru Popescu
>
> >> On Jul 10, 7:17 pm, "Nick Johnson (Google)" <[email protected]>
> >> wrote:
>
> >> > Here's the relevant ops that one of these pages 
> >> > (ex:http://www.nmaps.net/172547) does:
>
> >> > self.user = model.User.get(db.Key(self.session['user']))
>
> >> > map = model.Map.get_by_map_id(int(map_id))
>
> >> > q = model.Comment.all().ancestor(map).order("-lastupdated")
> >> > comments = q.fetch(COMMENTS_PER_PAGE + 1)
>
> >> > q = model.Map.all()
> >> > q.filter("user =", map.user)
> >> > q.filter("unlisted =", False)
> >> > q.filter("float_num <", map.float_num)
> >> > q.order("-float_num")
> >> > other_maps = q.fetch(6)
>
> >> > As you can see, typically two gets for a single entity, along with a
> >> > couple of queries - one for up to 40 entities (COMMENTS_PER_PAGE=40),
> >> > one for 6.
>
> >> > -Nick Johnson
>
> >> > On Fri, Jul 10, 2009 at 5:09 PM, Alex
>
> >> > Popescu<[email protected]> wrote:
>
> >> > > Hi Nick,
>
> >> > > I'll deploy some tests to further investigate this issue and I'll
> >> > > provide the code and results.
>
> >> > > Meanwhile, I'm wondering if you could share the relevant pieces of
> >> > > code in your app that are tipically
> >> > > below 600ms. I must confess that I haven't seen many such cases in my
> >> > > app so far
> >> > > (except some on which I'm doing extremely aggressive caching).
>
> >> > > ./alex
> >> > > --
> >> > > .w( the_mindstorm )p.
> >> > >  Alexandru Popescu
>
> >> > > On Jul 10, 2:46 pm, "Nick Johnson (Google)" <[email protected]>
> >> > > wrote:
> >> > >> Hi Alex,
>
> >> > >> It's impossible to give a useful comment without first seeing the code
> >> > >> that you're using. Speaking from my own experience, I have a personal
> >> > >> site that serves reasonably datastore-intensive pages, and typically
> >> > >> total CPU milliseconds doesn't exceed about 600 - that's to do several
> >> > >> get requests and at least one query. So it's certainly possible to do
> >> > >> useful work with the datastore and come in well below the 1300 to
> >> > >> 1400ms you quote.
>
> >> > >> -Nick Johnson
>
> >> > >> On Fri, Jul 10, 2009 at 1:27 AM, Alex
>
> >> > >> Popescu<[email protected]> wrote:
>
> >> > >> > Hi guys,
>
> >> > >> > I have a cron op that is fetching some records for an entity and 
> >> > >> > does
> >> > >> > some processing on them.
>
> >> > >> > While checking my app logs I have noticed that for the trivial case
> >> > >> > when no result is returned (and so there is no additional processing
> >> > >> > done) this operation is billed constantly with something between
> >> > >> > 1300ms and 1400ms.
>
> >> > >> > This basically tells me that this is the initial cost of a roundtrip
> >> > >> > to the datastore (plus basic request dispatching; note: the response
> >> > >> > doesn't have any data).
>
> >> > >> > Based on this I can further deduce that on a daily basis the free 
> >> > >> > CPU
> >> > >> > quota will allow me to run around 18k datastore roundtrips with 0
> >> > >> > results.
>
> >> > >> > How do you comment on this data when compared with the public
> >> > >> > announcements you've done in the past about the amount of requests 
> >> > >> > an
> >> > >> > app can serve based only on free quota?
>
> >> > >> > ./alex
> >> > >> > --
> >> > >> > .w( the_mindstorm )p.
> >> > >> >  Alexandru Popescu
>
> >> > >> > PS: I have argued (and I continue to believe) that billing for CPU
> >> > >> > time is completely incorrect to app engine customers and that Google
> >> > >> > should look into some alternative options:
> >> > >> >http://themindstorms.blogspot.com/2009/06/open-letter-to-google-app-e...
>
> >> > >> --
> >> > >> Nick Johnson, App Engine Developer Programs Engineer
> >> > >> Google Ireland Ltd. :: Registered in Dublin, Ireland, Registration
> >> > >> Number: 368047
>
> >> > --
> >> > Nick Johnson, App Engine Developer Programs Engineer
> >> > Google Ireland Ltd. :: Registered in Dublin, Ireland, Registration
> >> > Number: 368047
>
> --
> Nick Johnson, App Engine Developer Programs Engineer
> Google Ireland Ltd. :: Registered in Dublin, Ireland, Registration
> Number: 368047
--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"Google App Engine" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to 
[email protected]
For more options, visit this group at 
http://groups.google.com/group/google-appengine?hl=en
-~----------~----~----~----~------~----~------~--~---
[google-appengine] Re: CPU time for trivial datastore roundtrip

Reply via email to