Re: [google-appengine] Re: Why resident instances in auto scaling are idle?

'Jordan (Cloud Platform Support)' via Google App Engine Fri, 28 Oct 2016 13:59:10 -0700

Hey Vidya.

You are correct that instance start time depends heavily on your code,
since each new instance must load and initialize a fresh copy of your
code before it can serve requests.

As for why a single instance is handling the bulk of your requests, this
comes down to the App Engine scheduler, as you mentioned. The scheduler
simply asks the first instance whether it can handle a request. Based on
your scaling configuration for pending latency and concurrent requests,
that instance tells the scheduler it can take one more request, and so it
does, leaving the rest of your instances waiting to handle any overflow.
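
To make the "first instance that says yes" behavior concrete, here is a
toy model in Python. It is only a sketch of the idea described above, not
the actual App Engine scheduler: the function name, its parameters, and the
assumption that all requests arrive at once and none finish are all
hypothetical simplifications.

```python
def dispatch(num_requests, num_instances, max_concurrent):
    """Toy first-fit dispatch: each request goes to the first instance
    that still has spare concurrent capacity.

    Assumes every request is in flight at the same time (none complete),
    which is a deliberate simplification for illustration.
    Returns (in-flight count per instance, requests left pending).
    """
    in_flight = [0] * num_instances
    pending = 0
    for _ in range(num_requests):
        for i in range(num_instances):
            if in_flight[i] < max_concurrent:
                in_flight[i] += 1  # first instance with room takes it
                break
        else:
            # No instance had room; in this model the request just waits.
            pending += 1
    return in_flight, pending


if __name__ == "__main__":
    # High concurrency cap: the first instance absorbs everything.
    print(dispatch(8, 3, 10))  # ([8, 0, 0], 0)
    # Lower cap: overflow spills onto the second instance.
    print(dispatch(8, 3, 4))   # ([4, 4, 0], 0)
```

With a cap of 10 the first instance takes all eight requests and the other
two sit idle, which matches the pattern you are seeing. Lowering the cap
is what pushes work onto the remaining instances.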

If App Engine thinks you may need an extra instance warmed up in case of
overflow, it will create one. This is why you see a single Dynamic
instance at the bottom handling no requests. Again, App Engine sends
requests to Dynamic instances rather than to idle Resident instances. If
no Dynamic instance is available, a Resident instance will be treated as
a Dynamic instance, and a new Resident instance will be started to meet
your configured minimum idle instances
<https://cloud.google.com/appengine/docs/java/config/appref#scaling_elements>.

To configure your scaling options
<https://cloud.google.com/appengine/docs/java/config/appref#scaling_elements>
so that requests are spread more evenly across available instances, reduce
the number of concurrent requests a single instance is allowed to handle,
reduce the minimum pending latency that a request is allowed to wait in an
instance's pending queue, and reduce the max pending latency so that a
request is handed to a new instance after a shorter wait. Note that I
would not recommend setting any of these to zero, which would force each
request onto its own instance. You still want each instance to handle
multiple requests, to balance cost and performance.
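
As a rough illustration, the settings mentioned above live in the
<automatic-scaling> section of appengine-web.xml for the Java runtime. The
element names come from the scaling reference linked above; the values
below are placeholders I have assumed for the example, not recommendations
for your app.

```xml
<!-- appengine-web.xml (fragment). All values are illustrative assumptions. -->
<!-- Concurrent requests only apply when the app is marked thread-safe. -->
<threadsafe>true</threadsafe>

<automatic-scaling>
  <!-- Resident instances kept warm for overflow -->
  <min-idle-instances>2</min-idle-instances>
  <!-- Lower this so one instance accepts fewer simultaneous requests -->
  <max-concurrent-requests>5</max-concurrent-requests>
  <!-- Lower these so requests spend less time in the pending queue -->
  <min-pending-latency>30ms</min-pending-latency>
  <max-pending-latency>100ms</max-pending-latency>
</automatic-scaling>
```

Change one setting at a time and compare the results in your traces before
tightening anything further.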

Continue to use the Stackdriver Trace <https://cloud.google.com/trace/>
tool to see the latency breakdown of your requests, and use it to tune
your scaling settings so that requests do not wait too long in a pending
queue for the requests ahead of them to finish. Ideally, optimizing your
code to handle requests quickly and asynchronously (for example, using the
Task Queue for long image manipulation tasks instead of making the user
wait) will make your application more scalable in the cloud.

