On Tue, Jul 17, 2012 at 5:21 AM, Takashi Matsuo <[email protected]> wrote:
>
> On Tue, Jul 17, 2012 at 7:10 AM, Jeff Schnitzer <[email protected]> wrote:
>>
>> Hi Takashi.  I've read the performance settings documentation a dozen
>> times, and yet the scheduler behavior still seems flawed to me.
>
> I would rather not use the word 'flawed' here, but probably there is still
> room for improvement. First of all, is there any reason why you cannot use
> the min idle instances setting? Is it just because of the cost?

My goal is to have every request serviced with the minimum latency possible.

Leaving aside implementation complexity, there doesn't seem to be any
circumstance when it is efficient to remove a request from the pending
queue and lock it into a cold start.  There are really two cases:

 1) The request is part of a sudden burst.  The request will be
processed by an active instance before any new instances come online.
It therefore should stay in the queue.

 2) The request is part of new, sustained traffic.  Whether the
request waits in the pending queue for new instances to warm up, or
waits at a specific cold instance, the request is still going to wait.
 At least if it's still in the pending queue there's a chance it will
get routed to the first available instance... which overall is likely
going to be better than being locked to any particular (possibly
misbehaving) cold start.

Imagine you're at the supermarket checkout.  The ideal situation is to
have everyone waiting in one line and then route them off to cashiers
as they become available.  If the pending queue gets too long, you
open more cashiers (possibly multiple at once) until the line shrinks.
If every cashier has a separate queue, it's really hard to optimize
the number of cashiers, since you have some with a line 5 deep and some
sitting idle.
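To make the analogy concrete, here's a toy model (the numbers and routing
rules are illustrative, not the actual scheduler's behavior). All customers
arrive at once; the only difference is whether they wait in one shared line
or get pinned to a cashier on arrival:

```python
def mean_wait(service_times, n_servers, shared_queue):
    """Mean time customers spend waiting in line.

    All customers arrive at t=0.  shared_queue=True routes each customer
    to whichever cashier frees up first (one shared line); False pins
    customer k to cashier k % n_servers on arrival (separate lines).
    """
    free_at = [0.0] * n_servers          # when each cashier next becomes free
    waits = []
    for k, s in enumerate(service_times):
        if shared_queue:
            i = min(range(n_servers), key=free_at.__getitem__)
        else:
            i = k % n_servers
        waits.append(free_at[i])         # wait until that cashier is free
        free_at[i] += s
    return sum(waits) / len(waits)

jobs = [10, 1, 1, 1, 1, 1]   # one slow customer, five quick ones
print(mean_wait(jobs, 2, shared_queue=True))    # ~1.67: quick ones flow past
print(mean_wait(jobs, 2, shared_queue=False))   # 4.0: some stuck behind the slow one
```

With a shared line, one slow customer delays nobody who can be served
elsewhere; with pinned lines, everyone behind the slow customer is stuck -
which is exactly what happens to a request locked to a cold instance.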

I'm fully prepared to believe there are implementation complexities
that make the single-pending-queue difficult, but what I'm hearing is
that Google deliberately *wants* to send requests to cold starts...
which seems "flawed" to me.  Am I missing something?

> If so, can 'introducing a somewhat lower price for resident instances' be a
> workable feature request?
>
> Vaguely I have a feeling that what you're trying to accomplish here is to
> save money while acquiring good performance. If so, it is one of the most
> difficult things to implement. However, in my opinion, it's worth trying to
> implement, so let's continue the discussion.

Let's forget price for a moment and just try to work towards the goal
of having an efficient system.  Presumably, a more efficient system
will be more cost-effective than a less efficient system that has lots
of idle instances sitting around hogging RAM.  Good for Google, good
for us.

> If you have an app with an average 50s+ loading time, I totally understand that
> you strongly want to avoid sending requests to cold instances. On the other
> hand, there are also many well-behaved apps with <5 sec loading/warming
> times. Please understand that for those apps, it is still acceptable if we
> send requests to cold instances, so it's likely we cannot prioritize the
> feature below over other things, however...

Someone at Google must have a chart with "Time" on the X axis and
"% of Startup Requests" on the Y axis - basically a chart of what
percentage of startup requests in the Real World are satisfied at
various time boundaries.  I have a pretty good idea, I think, of what
this chart looks like.  I'm also fairly certain that the Python chart
looks nothing like the Java chart.

For one thing, the Java chart _starts_ at 5s.  The bare-minimum Hello
World app that creates a PersistenceManagerFactory with one class (the
Employee from the docs) takes 5-6s to start up.  And this is when GAE is
healthy; that time can easily double on a bad day.

So if you optimize GAE for apps with a <5s startup time, you're
optimizing for apps that don't exist - at least on the JVM.  I'd be
*very* surprised if the average real-world Java instance startup time
was less than 20s.  You just don't build apps that way in Java.  Given
a sophisticated application, I'm not even sure it's possible unless
the only datatypes you allow yourself are ArrayList and HashMap.

>> The min latency setting is actually working against us here.  What I
>> really want is a high (possibly infinite) minimum latency for moving
>> items from pending queue to a cold instance, but a low minimum latency
>> for warming up new instances.  I don't want requests waiting in the
>> pending queue, but it does me no good to have them sent to cold
>> instances.  I'd rather they wait in the queue until fresh instances
>> come online.
>
> To me, it looks like a great idea. Can you file a feature request for this,
> so that we can get a rough idea of how many people want it, and start an
> internal discussion?

http://code.google.com/p/googleappengine/issues/detail?id=7865

I generalized it to "User-facing requests should never be locked to
cold instance starts".
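In rough sketch form, the policy I'm asking for looks something like the
following. (All class names, method names, and thresholds here are made up
for illustration - this is not the real scheduler's API, just the shape of
the behavior I want.)

```python
from collections import deque

class Scheduler:
    """Sketch of a scheduler that never locks a pending request to a cold
    instance: a deep queue triggers warm-ups, but requests stay in the
    shared queue until a warm instance is actually available."""

    def __init__(self, max_pending_before_spinup=2):
        self.pending = deque()   # the single shared pending queue
        self.warm = []           # instances ready to serve right now
        self.warming = 0         # instances currently cold-starting
        self.max_pending = max_pending_before_spinup

    def submit(self, request):
        self.pending.append(request)
        self.dispatch()

    def dispatch(self):
        # Only warm instances ever receive requests.
        while self.pending and self.warm:
            instance = self.warm.pop()
            instance.handle(self.pending.popleft())
        # A long queue spins up more instances, but the requests stay
        # queued - none of them is locked to a cold start.
        while len(self.pending) - self.warming > self.max_pending:
            self.warming += 1
            self.start_instance()

    def instance_ready(self, instance):
        # A cold start finished; the instance can now take work.
        self.warming = max(0, self.warming - 1)
        self.warm.append(instance)
        self.dispatch()

    def start_instance(self):
        pass  # kick off an asynchronous cold start (stub)
```

The key property is in dispatch(): queue depth controls how many instances
we *start*, but it never controls where a request *goes* - that decision
waits until an instance is actually warm.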

Thanks,
Jeff

-- 
You received this message because you are subscribed to the Google Groups 
"Google App Engine" group.
