On Tue, Jul 17, 2012 at 5:21 AM, Takashi Matsuo <[email protected]> wrote:
>
> On Tue, Jul 17, 2012 at 7:10 AM, Jeff Schnitzer <[email protected]> wrote:
>>
>> Hi Takashi. I've read the performance settings documentation a dozen
>> times and yet the scheduler behavior still seems flawed to me.
>
> I would rather not use the word 'flawed' here, but there is probably still
> room for improvement. First of all, is there any reason why you cannot use
> the min idle instances setting? Is it just because of the cost?
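(For anyone following along, these are roughly the knobs under discussion. This is an illustrative sketch using the appengine-web.xml automatic-scaling element names from the App Engine Java docs; the values are mine, and at the time of this thread several of these were Admin Console sliders rather than XML settings.)

```xml
<?xml version="1.0" encoding="utf-8"?>
<appengine-web-app xmlns="http://appengine.google.com/ns/1.0">
  <application>example-app</application> <!-- hypothetical app id -->
  <version>1</version>
  <threadsafe>true</threadsafe>
  <automatic-scaling>
    <!-- Keep at least one warm (resident) instance around; billed as idle. -->
    <min-idle-instances>1</min-idle-instances>
    <max-idle-instances>automatic</max-idle-instances>
    <!-- How long a request may sit in the pending queue before the
         scheduler reacts (spins up a new instance, or routes anyway). -->
    <min-pending-latency>500ms</min-pending-latency>
    <max-pending-latency>automatic</max-pending-latency>
  </automatic-scaling>
</appengine-web-app>
```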
My goal is to have every request serviced with the minimum latency possible.

Leaving aside implementation complexity, there doesn't seem to be any circumstance in which it is efficient to remove a request from the pending queue and lock it into a cold start. There are really two cases:

1) The request is part of a sudden burst. The request will be processed by an active instance before any new instances come online. It therefore should stay in the queue.

2) The request is part of new, sustained traffic. Whether the request waits in the pending queue for new instances to warm up, or waits at a specific cold instance, the request is still going to wait. At least if it's still in the pending queue there's a chance it will get routed to the first available instance... which overall is likely to beat any particular (mis-)behaving instance start.

Imagine you're at the supermarket checkout. The ideal situation is to have everyone waiting in one line and then route them off to cashiers as they become available. If the pending queue gets too long, you open more cashiers (possibly several at once) until the line shrinks. If every cashier has a separate queue, it's really hard to optimize the number of cashiers, since some have a line five deep while others sit idle.

I'm fully prepared to believe there are implementation complexities that make the single pending queue difficult, but what I'm hearing is that Google deliberately *wants* to send requests to cold starts... which seems "flawed" to me. Am I missing something?

> If so, can 'introducing a somewhat lower price for resident instances' be a
> workable feature request?
>
> Vaguely I have a feeling that what you're trying to accomplish here is to
> save money while acquiring good performance. If so, it is one of the most
> difficult things to implement. However, in my opinion, it's worth trying to
> implement, so let's continue the discussion.
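(The checkout analogy above can be sketched as a toy simulation. This is my own illustration, not GAE's actual scheduler: `pinned_queue_waits` stands in for "locking a request to one instance on arrival", and all the numbers are made up.)

```python
def single_queue_waits(arrivals, services, n_servers):
    """One shared line: each request goes to whichever server frees up first."""
    free_at = [0.0] * n_servers
    waits = []
    for arrive, service in zip(arrivals, services):
        i = min(range(n_servers), key=lambda s: free_at[s])
        start = max(arrive, free_at[i])
        waits.append(start - arrive)
        free_at[i] = start + service
    return waits

def pinned_queue_waits(arrivals, services, n_servers):
    """Per-server lines: request i is locked to server i % n on arrival,
    even if another server is sitting idle."""
    free_at = [0.0] * n_servers
    waits = []
    for i, (arrive, service) in enumerate(zip(arrivals, services)):
        s = i % n_servers
        start = max(arrive, free_at[s])
        waits.append(start - arrive)
        free_at[s] = start + service
    return waits

# One slow request plus two quick ones, two servers: the pinned policy
# leaves the third request stuck behind the slow one while a server idles.
arrivals = [0.0, 0.0, 1.0]
services = [10.0, 1.0, 1.0]
print(single_queue_waits(arrivals, services, 2))  # [0.0, 0.0, 0.0]
print(pinned_queue_waits(arrivals, services, 2))  # [0.0, 0.0, 9.0]
```

The single shared line never does worse than pinning here, which is the intuition behind the cashier example.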
Let's forget price for a moment and just try to work towards the goal of having an efficient system. Presumably, a more efficient system will be more cost-effective than a less efficient system that has lots of idle instances sitting around hogging RAM. Good for Google, good for us.

> If you have an app with an average 50s+ loading time, I totally understand
> that you strongly want to avoid sending requests to cold instances. On the
> other hand, there are also many well-behaved apps with <5 sec loading/warming
> times. Please understand that for those apps, it is still acceptable if we
> send requests to cold instances, so it's likely we cannot prioritize the
> feature below over other things; however...

Someone at Google must have a chart with "Time" on the X axis and "% of Startup Requests" on the Y axis - basically a chart of what percentage of startup requests in the Real World are satisfied at various time boundaries. I have a pretty good idea, I think, of what this chart looks like. I'm also fairly certain that the Python chart looks nothing like the Java chart.

For one thing, the Java chart _starts_ at 5s. The bare-minimum Hello, World that creates a PersistenceManagerFactory with one class (the Employee in the docs) takes 5-6s to start up. And this is when GAE is healthy; that time can easily double on a bad day.

So if you optimize GAE for apps with a <5s startup time, you're optimizing for apps that don't exist - at least on the JVM. I'd be *very* surprised if the average real-world Java instance startup time was less than 20s. You just don't build apps that way in Java. Given a sophisticated application, I'm not even sure it's possible unless the only datatypes you allow yourself are ArrayList and HashMap.

>> The min latency setting is actually working against us here. What I
>> really want is a high (possibly infinite) minimum latency for moving
>> items from the pending queue to a cold instance, but a low minimum latency
>> for warming up new instances.
>> I don't want requests waiting in the
>> pending queue, but it does me no good to have them sent to cold
>> instances. I'd rather they wait in the queue until fresh instances
>> come online.
>
> To me, it looks like a great idea. Can you file a feature request for this,
> so that we can get a rough idea of how many people want it, and start an
> internal discussion?

http://code.google.com/p/googleappengine/issues/detail?id=7865

I generalized it to "User-facing requests should never be locked to cold instance starts".

Thanks,
Jeff

--
You received this message because you are subscribed to the Google Groups "Google App Engine" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to [email protected].
For more options, visit this group at http://groups.google.com/group/google-appengine?hl=en.
