I'd like to get a better understanding of what exactly these settings do. Say, for example, that I set Min Pending Latency to 1 second and my request takes 500 ms to respond. How long will that visitor have to wait to see the response? What if there are 10 simultaneous requests (using Python)? How long will each have to wait? AFAIU, the first 2 requests would each wait 1 second, the next 1.5 seconds, the next 2 seconds, etc. How does this help? All it seems to do is make the first request wait an extra 1 second and the next an extra 0.5 seconds; everyone else waits the same. I'm not sure how that keeps the number of instances down. At what point does GAE spin up another instance?
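To make the baseline I'm comparing against concrete, here is a minimal Python sketch of plain queueing arithmetic. It assumes all requests arrive at the same moment, every request takes the same time to serve, each instance handles one request at a time, and the instances are already running. It does not model Min Pending Latency or anything else about how the GAE scheduler actually decides things (that is exactly what I'm asking about), and the function name and parameters are made up for illustration.

```python
def response_times_ms(num_requests, service_ms, num_instances=1):
    """Time (ms) until each request gets its response.

    Assumptions (mine, not GAE's): all requests arrive at t=0, every
    request takes service_ms, and each already-running instance serves
    one request at a time in arrival order.
    """
    if num_instances < 1:
        raise ValueError("need at least one instance")
    # Request i is in "wave" i // num_instances; each wave finishes
    # one service time after the previous wave.
    return [((i // num_instances) + 1) * service_ms
            for i in range(num_requests)]


if __name__ == "__main__":
    for instances in (1, 2):
        times = response_times_ms(10, 500, num_instances=instances)
        print(instances, "instance(s):", [t / 1000 for t in times])
```

With one instance the tenth request waits 5 seconds; with two warm instances it waits 2.5 seconds. What I can't work out is where the pending latency setting fits into numbers like these.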
And if I set Max Idle Instances, how does that help? Shouldn't there only be 1 idle instance, until it becomes active and GAE spins up another idle one? I am sure there is some arithmetic involved in weighing the time it takes to spin up a new instance against the typical response time and the app's history of traffic spikes. If GAE thinks there is a possibility that you might need 10 instances, because you've needed them at some time in the past, then it might want 10 idle instances. If you set the max to 1, I'd guess you'd get some 500s. Obviously, I do not understand the ramifications of these settings. I am hoping we can use this thread to explain them better, and in more detail, than the docs do. Thanks.
