The biggest conclusion I drew was that mapreduce was the main cost in datastore reads. I had several different mapreduces running over the same entities. I have now merged them into a single mapreduce, which reduced cost quite a bit. Additionally, I created an Environment class that acts as an instance cache and only "puts" entities back to the datastore once, to avoid multiple puts during a single request (once you have a huge system, it's sometimes hard to track which parts of the system might already do put requests)...
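A minimal sketch of the instance-cache idea, not the actual MiuMeet code. Only the name Environment comes from the description above; the get/put/flush methods and the FakeDatastore stand-in (used here so the example runs without the App Engine SDK) are assumptions made for illustration.

```python
class FakeDatastore(object):
    """Hypothetical stand-in for the real datastore; counts calls."""

    def __init__(self):
        self.rows = {}
        self.reads = 0
        self.writes = 0

    def get(self, key):
        self.reads += 1
        return self.rows.get(key)

    def put_multi(self, entities):
        # One batched write call, however many entities it carries.
        self.writes += 1
        self.rows.update(entities)


class Environment(object):
    """Per-request cache: read each entity at most once, write dirty
    entities in a single batch when flush() is called."""

    def __init__(self, datastore):
        self._datastore = datastore
        self._cache = {}
        self._dirty = set()

    def get(self, key):
        # Serve from the cache; fall back to the datastore only once.
        if key not in self._cache:
            self._cache[key] = self._datastore.get(key)
        return self._cache[key]

    def put(self, key, entity):
        # Only record the change; nothing hits the datastore yet.
        self._cache[key] = entity
        self._dirty.add(key)

    def flush(self):
        # Intended to be called once, at the end of the request.
        if self._dirty:
            batch = dict((k, self._cache[k]) for k in self._dirty)
            self._datastore.put_multi(batch)
            self._dirty.clear()
```

With this shape, any part of the system can call env.put() as often as it likes during a request; the datastore sees at most one batched write when the request handler calls env.flush() at the end.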
Cheers,
-Andrin

On Sun, Jan 8, 2012 at 8:13 AM, jon <[email protected]> wrote:

> Andrin what conclusion did you draw from that side by side view?
>
> For us it's datastore that costs the most. I spent days rewriting our
> fanout implementation from a datastore-oriented one to a
> memcache-oriented alternative. Ironically our initial implementation
> was based on Brett Slatkin's talk.
>
> On Jan 6, 2:18 am, Andrin von Rechenberg <[email protected]> wrote:
> > Actually appstats gives me pretty much what I need, now that I looked
> > at it more carefully.
> > If you put the Billing History side by side with the appstats RPC
> > stats (that you can have per path) you see exactly what paths your
> > cost comes from (except CPU time).
> >
> > Cheers,
> > -Andrin
> >
> > On Tue, Jan 3, 2012 at 2:02 PM, Andrin von Rechenberg
> > <[email protected]> wrote:
> >
> > > Hey there
> > >
> > > First of all: It's great to have such an active community.
> > >
> > > (@Kaan) My app is called MiuMeet. It's one of the leading location
> > > based Social/Dating Networks on mobile.
> > >
> > > (@all) The dashboard gives me a nice idea about where the costs come
> > > from. I'd like to analyze Datastore read/writes more closely. All
> > > static content is cached forever externally (im using url cache
> > > busting)
> > >
> > > (@yohan) It's run on python. It's a single app that generates this
> > > amount of traffic. I dont use thirdparty frameworks. Thanks for the
> > > pointer with the http cost return header. The problem is, since I do
> > > heavy caching, this header will only help me if the cache is cold. I
> > > can get an idea from this header but I would have to record a couple
> > > of thousand headers to get an idea. Is there no middleware for this
> > > like appstats? :)
> > >
> > > (@jon) Why use sharded counters (maybe i've understood something
> > > wrong) ?
> > > I have built a pretty cool counter system for appengine that was in
> > > the Google AppEngine blog:
> > >
> > > http://googleappengine.blogspot.com/2011/10/prodeagle-analyzing-your-...
> > >
> > > (@Cayden) I use discounted hours heavily. I would also love to try
> > > Python2.7 but as Brandon points out one has some reasons to be
> > > hesitant.
> > >
> > > (@Brandon) Your offer sounds interesting. I think you'd need a little
> > > more than 8h due to the size of the system - but maybe I
> > > underestimate the power of your mermaid costume :) I have built quite
> > > a few systems during my time as a Google Employee that are much
> > > bigger than MiuMeet and have a lot of ideas how to optimize MiuMeet.
> > > My main problem is that I need to figure out quickly how much cost I
> > > can save with which optimization. And therefor I'd like to measure
> > > better where the money is spent. When I see where it is spent I will
> > > probably have an idea how to optimize it. My problem is not the
> > > engineering challenge but the time to implement all optimizations. So
> > > I'd like to start with the low hanging fruits. But I will def think
> > > about your offer.
> > >
> > > (@all) As I mentioned above: Given that there is a http return header
> > > that estimates the cost of a request, shouldn't it be quite straight
> > > forward to build a middleware like appstats that lists cost per
> > > request path? (I haven't looked at all at building middlewares)
> > >
> > > Cheers and thanks to everyone for the replies
> > > -Andrin
> > >
> > > On Tue, Jan 3, 2012 at 4:54 AM, Brandon Wirtz <[email protected]> wrote:
> > >
> > >> Cayden,
> > >>
> > >> I'm a big fan of Python 2.7, but I wouldn't dream of telling someone
> > >> running this large of an App to move to it right now. I've seen what
> > >> happens when the scheduler gets things wrong, and that "increased
> > >> latency" results in "massive time outs".
> > >>
> > >> Python 2.7 is awesome if all of your requests are under 7 seconds.
> > >> But our experience has been that if you have more than 1 in 50
> > >> requests taking more than 7 seconds python 2.7 will buckle under
> > >> load.
> > >>
> > >> I blame the scheduler, and that might not be the problem, but I know
> > >> that when you start having Long requests things start timing out,
> > >> and it appears to be that the scheduler is willing to stack things
> > >> poorly, under load these timeouts cascade.
> > >>
> > >> -Brandon
> > >>
> > >> -----Original Message-----
> > >> From: [email protected]
> > >> [mailto:[email protected]] On Behalf Of Cayden Meyer
> > >> Sent: Monday, January 02, 2012 7:10 PM
> > >> To: Google App Engine
> > >> Subject: [google-appengine] Re: Where to start optimizing cost?
> > >>
> > >> Hi Andrin,
> > >> The admin console can provide a great deal of information on where
> > >> your costs are coming from.
> > >> I know that you specified that you wanted ways to monitor rather
> > >> than solutions to reduce costs, however these are four fairly easy
> > >> to measure changes:
> > >> - If your application is not CPU bound you may wish to migrate to
> > >>   Python2.7. This can lower the number of instances required to
> > >>   serve the same amount of traffic. Changes can be seen by comparing
> > >>   the number of active instances with python2.7 to serve traffic y
> > >>   vs activate instances with python2.5 to serve traffic y. Note:
> > >>   Python 2.7 is currently experimental and the your latency may
> > >>   increase when moving to Python2.7.
> > >> - Using memcache can reduce datastore operations.
> > >> - Use edge caching where possible, this can reduce the number of
> > >>   instances required to serve traffic.
> > >> - If you can roughly predict the amount of traffic you will receive,
> > >>   discount instance hours are a good way to reduce costs.
> > >> Hope this helps you optimize your application.
> > >> There are more ways to optimize your application, however these are
> > >> just a few simple ones which can make a quite a difference.
> > >> Cayden Meyer
> > >> Product Manager, Google App Engine
> > >>
> > >> On Jan 1, 2:42 am, Andrin von Rechenberg <[email protected]> wrote:
> > >> > Hey there
> > >> >
> > >> > I'm an absolute GAE-Lover! The only thing that is bothering me is
> > >> > cost. We spend about $10'000 a month in GAE cost. That's just too
> > >> > much for the traffic we serve. We are starting to look around for
> > >> > alternatives, but I'd really love to stay with GAE and just
> > >> > optimize the system.
> > >> >
> > >> > One of GAE's biggest flaws IMHO is that it is really hard to see
> > >> > where the costs are coming from. (Yes we have appstats in place,
> > >> > but our system is just too big to manually sample hundreds of
> > >> > requests).
> > >> >
> > >> > So here is a feature request that might help us optimize and stay
> > >> > with GAE:
> > >> > *Show cost per URI in the dashboard. That would be incredibly
> > >> > helpful.*
> > >> >
> > >> > Does anyone know of an existing elegant way to figure out where
> > >> > the cost come from? (I'm looking more for a way to measure rather
> > >> > than software solutions like "use memcache"). I could do something
> > >> > with prodeagle.com and estimate the cost per request myself but
> > >> > I'd rather use an already existing solution...
> > >> >
> > >> > Cheers,
> > >> > -Andrin
> > >>
> > >> --
> > >> You received this message because you are subscribed to the Google
> > >> Groups "Google App Engine" group.
> > >> To post to this group, send email to [email protected].
> > >> To unsubscribe from this group, send email to
> > >> [email protected].
> > >> For more options, visit this group at
> > >> http://groups.google.com/group/google-appengine?hl=en.
