The biggest conclusion I drew was that mapreduce was the main
cost in datastore reads. I had several different mapreduces
running over the same entities. I have now combined them all
into a single mapreduce, which reduced cost quite a bit.
Additionally, I created an Environment class that acts as an
instance cache and only "puts" entities back to the datastore
once, to avoid multiple puts during a single request (once you
have a huge system it's sometimes hard to track which parts of
the system might already do put requests)...
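
To make the Environment idea concrete, here is a minimal sketch of a per-request instance cache that defers writes. It is an illustration under assumptions, not the actual MiuMeet code: the method names (get, mark_dirty, flush) and the FakeDatastore stand-in are hypothetical. On App Engine the two callables could be thin wrappers around db.get and db.put.

```python
class Environment(object):
    """Per-request instance cache that defers datastore writes (hypothetical API)."""

    def __init__(self, get_fn, put_multi_fn):
        self._get_fn = get_fn              # fetches one entity by key
        self._put_multi_fn = put_multi_fn  # writes a list of entities in one call
        self._cache = {}                   # key -> entity loaded during this request
        self._dirty = set()                # keys that must be written back

    def get(self, key):
        # Hit the datastore only the first time a key is requested.
        if key not in self._cache:
            self._cache[key] = self._get_fn(key)
        return self._cache[key]

    def mark_dirty(self, key, entity):
        # Record the change in memory instead of writing immediately.
        self._cache[key] = entity
        self._dirty.add(key)

    def flush(self):
        # Call once at the end of the request: a single batched write.
        entities = [self._cache[key] for key in self._dirty]
        if entities:
            self._put_multi_fn(entities)
        self._dirty.clear()
        return len(entities)


class FakeDatastore(object):
    """In-memory stand-in for the datastore, used only for this demo."""

    def __init__(self):
        self.rows = {}
        self.reads = 0
        self.write_calls = 0

    def get(self, key):
        self.reads += 1
        return dict(self.rows.get(key, {"key": key}))

    def put_multi(self, entities):
        self.write_calls += 1
        for entity in entities:
            self.rows[entity["key"]] = dict(entity)


store = FakeDatastore()
env = Environment(store.get, store.put_multi)

# Two independent parts of the system touch the same entity.
user = env.get("user:1")
user["visits"] = 1
env.mark_dirty("user:1", user)

user = env.get("user:1")  # served from the cache, no second read
user["visits"] += 1
env.mark_dirty("user:1", user)

flushed = env.flush()     # one write call for the whole request
```

The point of the design is that no part of the system calls put directly; everything goes through the Environment, and the request handler flushes exactly once before returning.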

Cheers,
-Andrin

On Sun, Jan 8, 2012 at 8:13 AM, jon <[email protected]> wrote:

> Andrin, what conclusion did you draw from that side-by-side view?
>
> For us it's datastore that costs the most. I spent days rewriting our
> fanout implementation from a datastore-oriented one to a memcache-
> oriented alternative. Ironically our initial implementation was based
> on Brett Slatkin's talk.
>
> On Jan 6, 2:18 am, Andrin von Rechenberg <[email protected]> wrote:
> > Actually, appstats gives me pretty much what I need, now that I have looked
> > at it more carefully. If you put the Billing History side by side with the
> > appstats RPC stats (which you can get per path), you can see exactly which
> > paths your cost comes from (except CPU time).
> >
> > Cheers,
> > -Andrin
> >
> > On Tue, Jan 3, 2012 at 2:02 PM, Andrin von Rechenberg <[email protected]> wrote:
> >
> > > Hey there
> >
> > > First of all: It's great to have such an active community.
> >
> > > (@Kaan) My app is called MiuMeet. It's one of the leading location-based
> > > Social/Dating Networks on mobile.
> >
> > > (@all) The dashboard gives me a nice idea about where the costs come from.
> > > I'd like to analyze Datastore reads/writes more closely. All static content
> > > is cached forever externally (I'm using URL cache busting).
> >
> > > (@yohan) It runs on Python. It's a single app that generates this amount
> > > of traffic. I don't use third-party frameworks. Thanks for the pointer to
> > > the HTTP cost return header. The problem is, since I do heavy caching, this
> > > header will only help me if the cache is cold. I can get an idea from this
> > > header, but I would have to record a couple of thousand headers to get a
> > > reliable picture. Is there no middleware for this like appstats? :)
> >
> > > (@jon) Why use sharded counters (maybe I've misunderstood something)?
> > > I have built a pretty cool counter system for App Engine that was featured
> > > on the Google App Engine blog:
> >
> > > http://googleappengine.blogspot.com/2011/10/prodeagle-analyzing-your-...
> >
> > > (@Cayden) I use discounted hours heavily. I would also love to try
> > > Python 2.7, but as Brandon points out, one has some reasons to be hesitant.
> >
> > > (@Brandon) Your offer sounds interesting. I think you'd need a little more
> > > than 8h due to the size of the system, but maybe I underestimate the power
> > > of your mermaid costume :) I built quite a few systems during my time
> > > as a Google employee that are much bigger than MiuMeet, and I have a lot of
> > > ideas for how to optimize MiuMeet. My main problem is that I need to figure
> > > out quickly how much cost I can save with which optimization. Therefore I'd
> > > like to measure better where the money is spent. When I see where it is
> > > spent, I will probably have an idea of how to optimize it. My problem is not
> > > the engineering challenge but the time to implement all the optimizations,
> > > so I'd like to start with the low-hanging fruit. But I will definitely think
> > > about your offer.
> >
> > > (@all) As I mentioned above: given that there is an HTTP return header that
> > > estimates the cost of a request, shouldn't it be quite straightforward to
> > > build a middleware like appstats that lists cost per request path? (I
> > > haven't looked at building middleware at all.)
> >
> > > Cheers and thanks to everyone for the replies
> > > -Andrin
> >
> > > On Tue, Jan 3, 2012 at 4:54 AM, Brandon Wirtz <[email protected]> wrote:
> >
> > >> Cayden,
> >
> > >> I'm a big fan of Python 2.7, but I wouldn't dream of telling someone
> > >> running an app this large to move to it right now. I've seen what happens
> > >> when the scheduler gets things wrong, and that "increased latency" results
> > >> in "massive timeouts".
> >
> > >> Python 2.7 is awesome if all of your requests are under 7 seconds. But our
> > >> experience has been that if more than 1 in 50 requests takes more than
> > >> 7 seconds, Python 2.7 will buckle under load.
> >
> > >> I blame the scheduler, and that might not be the problem, but I know that
> > >> when you start having long requests, things start timing out. It appears
> > >> that the scheduler is willing to stack things poorly, and under load these
> > >> timeouts cascade.
> >
> > >> -Brandon
> >
> > >> -----Original Message-----
> > >> From: [email protected]
> > >> [mailto:[email protected]] On Behalf Of Cayden Meyer
> > >> Sent: Monday, January 02, 2012 7:10 PM
> > >> To: Google App Engine
> > >> Subject: [google-appengine] Re: Where to start optimizing cost?
> >
> > >> Hi Andrin,
> > >> The admin console can provide a great deal of information on where your
> > >> costs are coming from.
> > >> I know that you specified that you wanted ways to monitor rather than
> > >> solutions to reduce costs; however, these are four changes whose effect
> > >> is fairly easy to measure:
> > >> - If your application is not CPU bound, you may wish to migrate to
> > >> Python 2.7. This can lower the number of instances required to serve the
> > >> same amount of traffic. You can see the change by comparing the number of
> > >> active instances needed to serve traffic y on Python 2.7 with the number
> > >> needed on Python 2.5. Note: Python 2.7 is currently experimental and
> > >> your latency may increase when moving to Python 2.7.
> > >> - Using memcache can reduce datastore operations.
> > >> - Use edge caching where possible; this can reduce the number of instances
> > >> required to serve traffic.
> > >> - If you can roughly predict the amount of traffic you will receive,
> > >> discount instance hours are a good way to reduce costs.
> > >> Hope this helps you optimize your application. There are more ways to
> > >> optimize your application; these are just a few simple ones that
> > >> can make quite a difference.
> > >> Cayden Meyer
> > >> Product Manager, Google App Engine
> >
> > >> On Jan 1, 2:42 am, Andrin von Rechenberg <[email protected]> wrote:
> > >> > Hey there
> >
> > >> > I'm an absolute GAE-Lover! The only thing that is bothering me is cost.
> > >> > We spend about $10'000 a month in GAE cost. That's just too much for
> > >> > the traffic we serve. We are starting to look around for alternatives,
> > >> > but I'd really love to stay with GAE and just optimize the system.
> >
> > >> > One of GAE's biggest flaws IMHO is that it is really hard to see where
> > >> > the costs are coming from. (Yes, we have appstats in place, but our
> > >> > system is just too big to manually sample hundreds of requests.)
> >
> > >> > So here is a feature request that might help us optimize and stay with
> > >> > GAE:
> > >> > *Show cost per URI in the dashboard. That would be incredibly
> > >> > helpful.*
> >
> > >> > Does anyone know of an existing elegant way to figure out where the
> > >> > costs come from? (I'm looking more for a way to measure rather than
> > >> > software solutions like "use memcache".) I could do something with
> > >> > prodeagle.com and estimate the cost per request myself, but I'd rather
> > >> > use an already existing solution...
> >
> > >> > Cheers,
> > >> > -Andrin
> >
> > >> --
> > >> You received this message because you are subscribed to the Google Groups
> > >> "Google App Engine" group.
> > >> To post to this group, send email to [email protected].
> > >> To unsubscribe from this group, send email to
> > >> [email protected].
> > >> For more options, visit this group at
> > >> http://groups.google.com/group/google-appengine?hl=en.
