Really looking forward to seeing a reply from Googler(s) about this.

On Thursday, July 19, 2012 12:11:27 AM UTC+2, Michael Hermus wrote:
>
> I certainly appreciate the attempted explanation. Correct me if I am
> wrong, but what you said amounts to this:
>
>> Google has to route requests to cold-start instances because otherwise,
>> for large-scale apps, the pending queue might get far too big, not to
>> mention it could blow up at some point.
>
> If that is true, I would be quite surprised, for the following reasons:
>
> a) Google's entire infrastructure is designed for EVERYTHING to scale
> massively and still work well.
> b) By waiting for instances to warm up first, I don't think you would
> really increase the maximum depth of the pending queue by much. In fact,
> the larger your app (i.e., the more instances you have active), the less
> impact it would have relative to the alternative.
> c) I don't think the pending queue is 'hosted' on a single machine; I am
> pretty sure it relies on a resilient queue infrastructure designed to
> tolerate failures and scale well.
>
> On Wednesday, July 18, 2012 5:07:04 PM UTC-4, hyperflame wrote:
>>
>> On Jul 18, 2:38 pm, Michael Hermus <[email protected]> wrote:
>> > I don't believe that you (or anyone) has sufficiently explained how
>> > sending user requests to cold instances is ever better than warming
>> > them up first. Such a request can ALWAYS be pulled right off the
>> > pending queue and sent to the instance as soon as it is ready.
>>
>> Let me take a shot at this. Brandon Wirtz touched on it before, when he
>> said: "I agree that the Queue is sub optimal, but it is more sub
>> optimal the smaller you are. When you get to 50 instances it is
>> amazing how well the load balancing works."
>>
>> Suppose you have a massive store (we'll call it MegaWalmart).
>> MegaWalmart has 100 staffed checkout lanes.
>> Suppose all of these lanes share a single queue of people, with a
>> supervisor who sends customers to open checkout lanes (roughly
>> analogous to your preferred way of handling GAE request queuing). That
>> one line would be huge, would block traffic around the store, and so
>> on. It's far better, from MegaWalmart's point of view, to have multiple
>> checkout lines, one line per lane.
>>
>> Now suppose another checkout lane opens up (remember, we now have one
>> line per lane). That's an additional 1% of capacity. If the checkout
>> clerk is drunk/hungover/whatever, that additional lane will take extra
>> time to open, annoying the customers lined up in that lane. From
>> MegaWalmart's point of view, who cares? Less than 1% of customers were
>> inconvenienced; 99% of people still had a decent time checking out.
>>
>> Let's apply this to the scheduler. Suppose there were one single queue
>> of requests. At Google scale, that queue could easily exceed millions
>> of entries, possibly billions. And God help you if the machine hosting
>> the queue gets a hiccup, or an outright failure. Don't you agree that,
>> at least at Google scale, requests should immediately be shunted to
>> instance-level queues? Even if a single instance takes forever, or
>> fails, we don't have to care: such a failure would only affect
>> 0.000001% of users.
>>
>> This leads me to my final point. My understanding, from reading the
>> documentation and blog/news posts about GAE, is that the core of GAE
>> is taken pretty much directly from production Google services. The
>> problem is that the scheduler is intended to work at very high scale,
>> not at low scale. Frankly, this makes sense when you consider a lot of
>> the finer points of the GAE ecosystem.
>>
>> So, to fix this: the GAE team needs to take a hard second look at the
>> scheduler code and rewrite it with two different sets of rules: one for
>> apps at fewer than 50 instances, and one for apps at more than 50
>> instances.
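The two queueing strategies hyperflame contrasts above can be sketched in a few lines. This is purely illustrative: the instance count, the round-robin assignment, and every name below are made up for the example and are not how GAE's scheduler is actually implemented.

```python
from collections import deque

NUM_INSTANCES = 100  # the 100 checkout lanes in the analogy

def route_single_queue(requests):
    """One shared pending queue: every request waits in a single line
    until some instance frees up (the scheme Michael prefers)."""
    return deque(requests)

def route_per_instance(requests):
    """Per-instance queues: each request is immediately shunted onto one
    instance's own queue (round-robin here, purely for illustration)."""
    queues = [deque() for _ in range(NUM_INSTANCES)]
    for i, req in enumerate(requests):
        queues[i % NUM_INSTANCES].append(req)
    return queues

requests = list(range(1000))

shared = route_single_queue(requests)
per_instance = route_per_instance(requests)

# The shared queue concentrates all 1000 requests in one place, while the
# per-instance scheme caps each line at 10 here, so one slow or cold
# instance only delays the requests in its own line.
print(len(shared))                        # 1000
print(max(len(q) for q in per_instance))  # 10
```

The trade-off in the thread falls out directly: the shared queue is a single point of depth (and, as hyperflame argues, of failure), while per-instance queues localize the damage of any one bad lane.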
>> Additionally, perhaps the GAE team should look at making the scheduler
>> smarter: it could measure the startup time of instances and, in the
>> future, not send requests to cold instances until that startup time
>> has elapsed.
>>
>> Personal thoughts: I have administered a corporate GAE app that
>> exceeded 100 instances, and I use GAE for personal apps that use, at
>> most, 3-4 instances. When you use GAE at these two extremes, you
>> really get an understanding of how GAE scales. A personal anecdote:
>> for my low-end apps, I occasionally notice that GAE starts up a new
>> idle instance. I'm not charged for it, and it doesn't do any work, but
>> it is counted in the "current instances" counter. My guess is that,
>> during off-peak times, the GAE scheduler loads additional instances of
>> low-end apps into memory, to try to be ready for quick scaling. So I
>> believe the GAE team tries to handle low-end apps, but the scheduler
>> does need more work.
>>
>> TL;DR: the scheduler needs more work, and MegaWalmart's checkout lines
>> work the same way as Google's scheduler.
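hyperflame's "smarter scheduler" idea above (measure startup time, and don't route to a cold instance until that time has elapsed) could look something like the following sketch. Every class and method name here is hypothetical; none of it comes from actual GAE code.

```python
class Instance:
    """A hypothetical app instance that remembers when it was started."""
    def __init__(self, started_at):
        self.started_at = started_at

class ColdAwareScheduler:
    """Tracks a running average of observed startup times and refuses to
    route requests to instances still inside that warm-up window."""
    def __init__(self):
        self.avg_startup = 0.0
        self.samples = 0

    def record_startup(self, seconds):
        # Incrementally update the running average with a new observation.
        self.samples += 1
        self.avg_startup += (seconds - self.avg_startup) / self.samples

    def is_routable(self, instance, now):
        # Only send requests to instances past the measured warm-up time.
        return (now - instance.started_at) >= self.avg_startup

sched = ColdAwareScheduler()
sched.record_startup(8.0)
sched.record_startup(12.0)  # measured average startup is now 10 seconds

cold = Instance(started_at=100.0)
print(sched.is_routable(cold, now=105.0))  # False: still warming up
print(sched.is_routable(cold, now=111.0))  # True: past the 10s window
```

A scheduler like this would keep cold instances out of rotation during their warm-up window, while the pending queue (or warmup requests) absorbs the load, which is exactly the behavior the thread is asking for.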
--
You received this message because you are subscribed to the Google Groups "Google App Engine" group.
To view this discussion on the web visit https://groups.google.com/d/msg/google-appengine/-/eMMuPExQKeAJ.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to [email protected].
For more options, visit this group at http://groups.google.com/group/google-appengine?hl=en.
