I certainly appreciate the attempted explanation. Correct me if I am wrong, 
but what you said amounts to this: 

> Google has to route requests to cold start instances because otherwise, for 
> large scale apps, the pending queue might get way too big, not to mention 
> it could blow up sometime.

If that is true, I suppose I would be quite surprised, for the following 
reasons:

a) Google's entire infrastructure is designed for EVERYTHING to scale 
massively and still work well.
b) Waiting for instances to warm up first would not really increase the 
maximum depth of the pending queue by much. In fact, the larger your app 
(i.e., the more instances you have active), the smaller the impact relative 
to the alternative.
c) I don't think the pending queue is 'hosted' on a single machine; I am 
pretty sure it relies on a resilient queue infrastructure designed to 
tolerate failures and scale well.
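To put rough numbers behind point (b), here is a back-of-envelope sketch. The arrival rate and warm-up time below are invented for illustration; they are not GAE measurements.

```python
# Back-of-envelope: extra pending-queue depth caused by holding requests
# for a warming instance instead of routing them to it cold.
# All numbers below are illustrative assumptions, not GAE measurements.

def extra_queue_depth(arrival_rate, instances, warmup_seconds):
    """Requests that pile up while ONE new instance warms up, if the
    balancer spreads load evenly and holds that instance's share
    (1/instances of traffic) in the pending queue."""
    share = arrival_rate / instances   # req/s the new instance would get
    return share * warmup_seconds      # extra requests queued during warm-up

for n in (5, 50, 500):
    depth = extra_queue_depth(arrival_rate=1000.0, instances=n,
                              warmup_seconds=10.0)
    print(f"{n:4d} instances -> ~{depth:.0f} extra queued requests")
```

At 1000 req/s and a 10-second warm-up, one warming instance adds ~2000 queued requests for a 5-instance app but only ~20 for a 500-instance app, which is why the relative impact shrinks as the app grows.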


On Wednesday, July 18, 2012 5:07:04 PM UTC-4, hyperflame wrote:
>
> On Jul 18, 2:38 pm, Michael Hermus <[email protected]> wrote: 
> > I don't believe that you (or anyone) have sufficiently explained how 
> > sending user requests to cold instances is ever better than warming them 
> > up first. Said request can ALWAYS be pulled right off the pending queue 
> > and sent to the instance as soon as it is ready. 
>
> Let me take a shot at this. Brandon Wirtz touched on it before, when 
> he said "I agree that the Queue is sub optimal, but it is more sub 
> optimal the smaller you are.  When you get to 50 instances it is 
> amazing how well the load balancing works. " 
>
> Suppose you have a massive store (we'll call it MegaWalmart). 
> MegaWalmart has 100 staffed checkout lanes. Suppose all of these lanes 
> have a single queue of people lined up, and a supervisor which sends 
> customers to open checkout lanes (roughly analogous to your preferred 
> way of handling GAE request queuing). That one line would be huge, 
> would block traffic around the store, etc. It's far better, from 
> MegaWalmart's POV, to have multiple checkout lines, one line per 
> lane. 
>
> Now suppose you have another checkout lane open up. (remember, now we 
> have one line per lane) That's an additional 1% capacity. Now if the 
> checkout clerk is drunk/hungover/whatever, that additional lane will 
> take extra time to open, annoying the customers lined up in that lane. 
> From MegaWalmart's POV, who cares? Less than 1% of your customers were 
> inconvenienced. 99% of people still had a decent time checking out. 
>
> Let's apply this to the scheduler. Suppose there was one single queue 
> of requests. At Google-scale, that queue of requests could easily 
> exceed millions of entries, possibly billions. And God help you if the 
> machine hosting the queue gets a hiccup, or an outright failure. Don't 
> you agree that, at least at Google-scale, requests should immediately 
> be shunted to instance-level queues? Even if a single instance takes 
> forever, or fails, we don't have to care: such a failure would only 
> affect 0.000001% of users. 
>
> This leads me to my final point: My understanding, from reading the 
> documentation and blog/news posts about GAE, is that the core of 
> GAE is ripped pretty much directly from production Google services. 
> The problem with this is, the scheduler is intended to work at very 
> high scale, not at low scale. And frankly, this makes sense when you 
> consider a lot of the finer points of the GAE ecosystem. 
>
> So, to fix this: the GAE team needs to take a hard look at the 
> scheduler code and rewrite it with two sets of rules: one for apps 
> under 50 instances and one for apps above that. Additionally, the 
> GAE team could make the scheduler smarter; perhaps it could 
> measure the startup time of instances, and in the future, not send 
> requests to cold instances until that startup time has elapsed. 
>
> Personal thoughts: I have admined a corporate GAE app that has 
> exceeded 100 instances, and I use GAE for personal apps that use, at 
> max, 3-4 instances. When you use GAE at these two extremes, you really 
> get an understanding of how GAE scales. For instance, a personal 
> anecdote: for my low end apps, I occasionally notice that GAE starts 
> up a new idle instance. I'm not charged for it, it doesn't do any 
> work, but it is counted in the "current instances" counter. My guess 
> is that, during non-peak times, the GAE scheduler will load into 
> memory additional instances of low end apps, to try and be ready for 
> quick scaling.  So I believe the GAE team tries to handle low-end 
> instances, but it does need more work. 
>
> TLDR: the scheduler needs more work, and the MegaWalmart analogy maps 
> directly onto Google's scheduler.
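For what it's worth, the "measure startup time and hold off routing" idea from the quoted message could look something like this toy sketch. The class and field names are invented, and this is not GAE's actual scheduler; it just shows the routing rule being proposed.

```python
# Toy sketch: remember how long instances of an app take to start, and
# don't treat a cold instance as a routing target until that measured
# warm-up time has elapsed. All names here are invented for illustration.

class Instance:
    def __init__(self, started_at, warm=False):
        self.started_at = started_at  # seconds, some monotonic clock
        self.warm = warm              # has it served a request yet?

class Scheduler:
    def __init__(self, default_warmup=5.0):
        self.instances = []
        self.est_warmup = default_warmup  # rolling estimate, seconds

    def record_startup(self, seconds):
        # Exponentially weighted average of observed startup times.
        self.est_warmup = 0.8 * self.est_warmup + 0.2 * seconds

    def routable(self, now):
        # A cold instance only becomes a routing target once the
        # estimated warm-up window has passed.
        return [i for i in self.instances
                if i.warm or now - i.started_at >= self.est_warmup]

s = Scheduler()
s.instances.append(Instance(started_at=0.0, warm=True))
s.instances.append(Instance(started_at=98.0))     # cold, 2s old at t=100
print(len(s.routable(now=100.0)))                 # → 1 (cold one still warming)
```

This keeps the per-instance queues the scheduler already uses; it only changes which instances the balancer is willing to send traffic to.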

-- 
You received this message because you are subscribed to the Google Groups 
"Google App Engine" group.
To view this discussion on the web visit 
https://groups.google.com/d/msg/google-appengine/-/FVPDaFE41akJ.
