Really looking forward to seeing a reply from Googler(s) regarding this.

On Thursday, July 19, 2012 12:11:27 AM UTC+2, Michael Hermus wrote:
>
> I certainly appreciate the attempted explanation. Correct me if I am 
> wrong, but what you said amounts to this: 
>
>> Google has to route requests to cold start instances because otherwise, 
>> for large scale apps, the pending queue might get way too big, not to 
>> mention it could blow up sometime.
>>
>
> If that is true, I suppose I would be quite surprised, for the following 
> reasons:
>
> a) Google's entire infrastructure is designed for EVERYTHING to scale 
> massively and still work well.
> b) By waiting for instances to warm up first, I don't think you would 
> really increase the maximum depth of the pending queue by a whole lot. In 
> fact, the larger your app (i.e. the more instances you have active), the 
> less impact it would have relative to the alternative.
> c) I don't think the pending queue is 'hosted' on a single machine; I am 
> pretty sure it relies on a resilient queue infrastructure designed to 
> tolerate failures and scale well.
>
>
> On Wednesday, July 18, 2012 5:07:04 PM UTC-4, hyperflame wrote:
>>
>> On Jul 18, 2:38 pm, Michael Hermus <[email protected]> wrote: 
>> > I don't believe that you (or anyone) have sufficiently explained how 
>> > sending user requests to cold instances is ever better than warming them 
>> > up first. Said request can ALWAYS be pulled right off the pending queue 
>> > and sent to the instance as soon as it is ready. 
>>
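[Editor's note: Michael's warm-up-first preference above can be expressed as a toy model: one global pending queue, with a request handed out only once an instance reports it is fully warm. All class and method names below are illustrative, not real GAE APIs.]

```python
import collections

class WarmupFirstQueue:
    """Toy model of the warm-up-first design: requests wait in a single
    global pending queue until a warm instance is available. Hypothetical
    names; this is a sketch of the idea, not actual GAE scheduler code."""

    def __init__(self):
        self.pending = collections.deque()          # requests waiting globally
        self.ready_instances = collections.deque()  # instances that finished warming up

    def enqueue(self, request):
        # A new request never goes to a cold instance; it waits here.
        self.pending.append(request)
        return self._match()

    def instance_ready(self, instance_id):
        # An instance announces it has finished its warm-up.
        self.ready_instances.append(instance_id)
        return self._match()

    def _match(self):
        # Pair the oldest pending request with the next warm instance.
        if self.pending and self.ready_instances:
            return (self.ready_instances.popleft(), self.pending.popleft())
        return None

q = WarmupFirstQueue()
q.enqueue("r1")                 # no warm instance yet -> request waits
print(q.instance_ready("i1"))   # ('i1', 'r1'): dispatched only once warm
```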
>> Let me take a shot at this. Brandon Wirtz touched on it before, when 
>> he said "I agree that the Queue is sub optimal, but it is more sub 
>> optimal the smaller you are.  When you get to 50 instances it is 
>> amazing how well the load balancing works. " 
>>
>> Suppose you have a massive store (we'll call it MegaWalmart). 
>> MegaWalmart has 100 staffed checkout lanes. Suppose all of these lanes 
>> share a single queue of people, with a supervisor who sends customers 
>> to open checkout lanes (roughly analogous to your preferred way of 
>> handling GAE request queuing). That one line would be huge, would 
>> block traffic around the store, and so on. It's far better, from 
>> MegaWalmart's POV, to have multiple checkout lines, one per lane. 
>>
>> Now suppose another checkout lane opens up (remember, we now have one 
>> line per lane). That's an additional 1% of capacity. If that lane's 
>> clerk is drunk/hungover/whatever, the lane will take extra time to 
>> open, annoying the customers lined up in it. From MegaWalmart's POV, 
>> who cares? Less than 1% of your customers were inconvenienced; 99% of 
>> people still had a decent time checking out. 
>>
>> Let's apply this to the scheduler. Suppose there was one single queue 
>> of requests. At Google-scale, that queue of requests could easily 
>> exceed millions of entries, possibly billions. And God help you if the 
>> machine hosting the queue gets a hiccup, or an outright failure. Don't 
>> you agree that, at least at Google-scale, requests should immediately 
>> be shunted to instance-level queues? Even if a single instance takes 
>> forever, or fails, we don't have to care: such a failure would only 
>> affect 0.000001% of users. 
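[Editor's note: the per-instance shunting described above can be sketched in a few lines. With round-robin assignment, a slow or cold instance only delays its own queue's share of traffic, and no single global queue ever accumulates. The names are hypothetical illustrations, not GAE internals.]

```python
import collections
import itertools

class InstanceLevelDispatcher:
    """Toy model of instance-level queueing: each request is shunted to
    an instance's own queue immediately, instead of waiting in one
    global pending queue. Sketch only; not actual scheduler code."""

    def __init__(self, num_instances):
        self.queues = [collections.deque() for _ in range(num_instances)]
        self._rr = itertools.cycle(range(num_instances))

    def dispatch(self, request):
        # Assign the request to an instance right away, even if that
        # instance is still cold. A stalled instance delays only the
        # requests in its own queue, not the whole fleet.
        self.queues[next(self._rr)].append(request)

    def max_queue_depth(self):
        return max(len(q) for q in self.queues)

dispatcher = InstanceLevelDispatcher(num_instances=100)
for i in range(1000):
    dispatcher.dispatch(f"req-{i}")
print(dispatcher.max_queue_depth())  # 10: load spread evenly across instances
```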
>>
>> This leads me to my final point: my understanding, from reading the 
>> documentation and blog/news posts about GAE, is that the core of GAE 
>> is ripped pretty much directly from production Google services. The 
>> problem is that the scheduler is intended to work at very high scale, 
>> not at low scale. And frankly, this makes sense when you consider a 
>> lot of the finer points of the GAE ecosystem. 
>>
>> So, to fix this: the GAE team needs to take a hard look at the 
>> scheduler code and rewrite it with two different sets of rules, one 
>> for apps below 50 instances and one for apps above. Additionally, 
>> perhaps they should look at making the scheduler smarter; it could 
>> measure the startup time of instances and, in the future, not send 
>> requests to a cold instance until that startup time has elapsed. 
>>
>> Personal thoughts: I have administered a corporate GAE app that 
>> exceeded 100 instances, and I use GAE for personal apps that use, at 
>> most, 3-4 instances. When you use GAE at these two extremes, you 
>> really get an understanding of how GAE scales. A personal anecdote: 
>> for my low-end apps, I occasionally notice that GAE starts up a new 
>> idle instance. I'm not charged for it, and it doesn't do any work, 
>> but it is counted in the "current instances" counter. My guess is 
>> that, during non-peak times, the GAE scheduler loads additional 
>> instances of low-end apps into memory, to try to be ready for quick 
>> scaling. So I believe the GAE team tries to handle low-end apps, but 
>> the scheduler does need more work. 
>>
>> TL;DR: the scheduler needs more work, and MegaWalmart's checkout 
>> lines work the same way as Google's scheduler.
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"Google App Engine" group.
To view this discussion on the web visit 
https://groups.google.com/d/msg/google-appengine/-/eMMuPExQKeAJ.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/google-appengine?hl=en.
