I certainly appreciate the attempted explanation. Correct me if I am wrong, but what you said amounts to this:
Google has to route requests to cold-start instances because otherwise, for large-scale apps, the pending queue might grow far too big; it could even blow up at some point.

If that is true, I would be quite surprised, for the following reasons:

a) Google's entire infrastructure is designed for EVERYTHING to scale massively and still work well.

b) By waiting for instances to warm up first, I don't think you would really increase the maximum depth of the pending queue by much. In fact, the larger your app (i.e., the more instances you have active), the less impact it would have relative to the alternative.

c) I don't think the pending queue is 'hosted' on a single machine; I am pretty sure it relies on a resilient queue infrastructure designed to tolerate failures and scale well.

On Wednesday, July 18, 2012 5:07:04 PM UTC-4, hyperflame wrote:
>
> On Jul 18, 2:38 pm, Michael Hermus <[email protected]> wrote:
> > I don't believe that you (or anyone) has sufficiently explained how
> > sending user requests to cold instances is ever better than warming them
> > up first. Said request can ALWAYS be pulled right off the pending queue
> > and sent to the instance as soon as it is ready.
>
> Let me take a shot at this. Brandon Wirtz touched on it before, when he
> said "I agree that the Queue is sub optimal, but it is more sub optimal
> the smaller you are. When you get to 50 instances it is amazing how well
> the load balancing works."
>
> Suppose you have a massive store (we'll call it MegaWalmart). MegaWalmart
> has 100 staffed checkout lanes. Suppose all of these lanes shared a
> single queue of people, with a supervisor who sends customers to open
> checkout lanes (roughly analogous to your preferred way of handling GAE
> request queuing). That one line would be huge, would block traffic around
> the store, etc. It's far better, from MegaWalmart's POV, to have multiple
> checkout lines, one per lane.
> Now suppose another checkout lane opens up. (Remember, we now have one
> line per lane.) That's an additional 1% of capacity. If the checkout
> clerk is drunk/hungover/whatever, that additional lane will take extra
> time to open, annoying the customers lined up in that lane. From
> MegaWalmart's POV, who cares? Less than 1% of your customers were
> inconvenienced; 99% of people still had a decent time checking out.
>
> Let's apply this to the scheduler. Suppose there were one single queue of
> requests. At Google scale, that queue could easily exceed millions of
> entries, possibly billions. And God help you if the machine hosting the
> queue gets a hiccup, or an outright failure. Don't you agree that, at
> least at Google scale, requests should immediately be shunted to
> instance-level queues? Even if a single instance takes forever, or
> fails, we don't have to care: such a failure would only affect
> 0.000001% of users.
>
> This leads me to my final point. My understanding, from reading the
> documentation and blog/news posts about GAE, is that the core of GAE is
> ripped pretty much directly from production Google services. The problem
> is that the scheduler is designed to work at very high scale, not at low
> scale. And frankly, this makes sense when you consider a lot of the finer
> points of the GAE ecosystem.
>
> So, to fix this: the GAE team needs to take a hard look at the scheduler
> code and rewrite it with two different sets of rules, one for apps below
> 50 instances and one for apps above 50. Additionally, perhaps the team
> should look at making the scheduler smarter; it could measure the
> startup time of instances and, going forward, not send requests to cold
> instances until that startup time has elapsed.
>
> Personal thoughts: I have admined a corporate GAE app that exceeded 100
> instances, and I use GAE for personal apps that use, at most, 3-4
> instances.
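For what it's worth, hyperflame's proposed fix (measure instance startup times and hold requests in the pending queue until an instance is actually warm) can be sketched in a few lines. This is a toy model, not GAE's actual scheduler; the class names, the round-robin policy, and all timings are invented:

```python
from collections import deque

class Instance:
    """Toy app instance with a measured warmup (cold-start) period."""
    def __init__(self, name, started_at, warmup_seconds):
        self.name = name
        self.started_at = started_at
        self.warmup_seconds = warmup_seconds  # hypothetical measured startup time

    def is_warm(self, now):
        return now - self.started_at >= self.warmup_seconds

class WarmupAwareScheduler:
    """Routes requests only to warm instances, holding the rest in a
    pending queue until some instance's startup time has elapsed."""
    def __init__(self, instances):
        self.instances = instances
        self.pending = deque()
        self.next_idx = 0  # naive round-robin over warm instances

    def route(self, request, now):
        """Return the instance chosen for `request`, or None if it was queued."""
        warm = [i for i in self.instances if i.is_warm(now)]
        if not warm:
            self.pending.append(request)  # wait rather than hit a cold instance
            return None
        target = warm[self.next_idx % len(warm)]
        self.next_idx += 1
        return target

    def drain_pending(self, now):
        """Re-dispatch queued requests once instances have warmed up."""
        dispatched = []
        while self.pending:
            warm = [i for i in self.instances if i.is_warm(now)]
            if not warm:
                break
            request = self.pending.popleft()
            target = warm[self.next_idx % len(warm)]
            self.next_idx += 1
            dispatched.append((request, target.name))
        return dispatched
```

Under this policy no request ever sits behind a cold instance's warmup; the trade-off is a deeper pending queue while instances spin up, which is exactly the cost I argue is small and hyperflame argues is dangerous at scale.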
> When you use GAE at these two extremes, you really get an understanding
> of how GAE scales. For instance, a personal anecdote: for my low-end
> apps, I occasionally notice that GAE starts up a new idle instance. I'm
> not charged for it, and it doesn't do any work, but it is counted in the
> "current instances" counter. My guess is that, during non-peak times,
> the GAE scheduler will load into memory additional instances of low-end
> apps, to try to be ready for quick scaling. So I believe the GAE team
> tries to handle low-end instances, but it does need more work.
>
> TL;DR: the scheduler needs more work, and MegaWalmart works the same way
> as Google's scheduler.

--
You received this message because you are subscribed to the Google Groups "Google App Engine" group.
To view this discussion on the web visit https://groups.google.com/d/msg/google-appengine/-/FVPDaFE41akJ.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to [email protected].
For more options, visit this group at http://groups.google.com/group/google-appengine?hl=en.
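To put rough numbers on the MegaWalmart point, here is a back-of-the-envelope model (all figures invented; this is not real GAE behavior): with one queue per instance and round-robin assignment, a single cold instance delays only its own share of the requests.

```python
# Toy model (invented numbers): 10 instances, each taking 1s per request,
# except one "cold" instance whose requests pay an extra 30s warmup.
# With per-instance queues, only that instance's share of requests is hurt.

NUM_INSTANCES = 10
SERVICE_TIME = 1.0   # seconds per request (assumed)
WARMUP = 30.0        # extra delay behind the cold instance (assumed)
REQUESTS = 100

def per_instance_queue_waits():
    """Round-robin requests into one FIFO queue per instance; return waits."""
    waits = []
    for r in range(REQUESTS):
        instance = r % NUM_INSTANCES
        position = r // NUM_INSTANCES     # depth in that instance's queue
        wait = position * SERVICE_TIME
        if instance == 0:                 # instance 0 is the cold one
            wait += WARMUP
        waits.append(wait)
    return waits

waits = per_instance_queue_waits()
delayed = [w for w in waits if w >= WARMUP]
print(f"affected by warmup: {len(delayed)}/{REQUESTS}")  # → affected by warmup: 10/100
```

Only 1 in 10 requests ever sees the warmup penalty; the other 90% wait at most 9 seconds, which is the "99% of people still had a decent time checking out" effect in miniature.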
