RE: [google-appengine] Why are several production issues related to DeadlineExceededErrors being ignored?

Brandon Wirtz Sat, 14 Jan 2012 14:15:37 -0800

Do you have a warm-up handler configured in your app.yaml?
If you don't, then the new instance has to warm up and handle a request
at the same time.  Specifying a warm-up handler that simply initializes
some variables and logs an event such as "Warm up complete" should fix
your issue.
 
I don't think you have "platform issues"; I think you have "Google hasn't
documented all best practices" issues.
 
 
 
From: [email protected]
[mailto:[email protected]] On Behalf Of Karl Rosaen
Sent: Saturday, January 14, 2012 6:26 AM
To: [email protected]
Subject: Re: [google-appengine] Why are several production issues related to
DeadlineExceededErrors being ignored?
 
Thanks, Brandon.  Many of the DeadlineExceededErrors were occurring during
warmup requests, according to the stack traces, during Python import
statements.  I upped the number of idle instances in an attempt to mitigate
this sort of thrashing, and your advice makes sense for this case.  Our
pending latency is set to 'Automatic' on both ends.
 
I'm attaching some graphs from the period when this was at its worst.
 
Instances:
 
 
<https://lh4.googleusercontent.com/--AtYMbWJ4ek/TxGNT3nfp0I/AAAAAAAAUuE/hTlZ
m78Mc08/s1600/Screen%252520Shot%2525202012-01-14%252520at%2525209.08.59%2525
20AM.png> 
 
Requests per second:
 
 
<https://lh6.googleusercontent.com/-LoIlwGhvLrA/TxGOnvzGmSI/AAAAAAAAUuc/Sg07
YssPK_4/s1600/Screen%252520Shot%2525202012-01-14%252520at%2525209.17.39%2525
20AM.png> 
 
 
 
Milliseconds per request:
 
 
<https://lh5.googleusercontent.com/-A76zVs8CCEo/TxGNZ9kcpfI/AAAAAAAAUuQ/w20A
uPvgw50/s1600/Screen%252520Shot%2525202012-01-14%252520at%2525209.09.41%2525
20AM.png> 
 
 
This suggests that some higher latency handlers were hit (some people were
editing content), taking up the existing front end instances, after which
GAE was trying to spin up some dynamic instances to serve other requests.
But during warmup, there were DeadlineExceededErrors during file imports,
suggesting that the dynamic instances aren't being given enough time to
warm up.
 
Increasing the idle instances helps.  So perhaps the revised question, at
least for our particular situation is: why, under load, do the dynamic
instances time out during warmup?  That seems to compound the problem as the
dynamic instances aren't able to serve the requests that are backed up,
leading to user visible 500 errors, and more attempts to dynamically load
instances.
 
Does my theory have any holes?  Is relying on dynamic instances to handle
spikes without 500 errors unrealistic?  I know the docs state, "A smaller
number of idle Instances means your application costs less to run, but may
encounter more startup latency during load spikes." but thrashing on
DeadlineExceededErrors during warmup seems to indicate that dynamic
instances can't be relied upon for load spikes at all right now.
 
 
-- 
You received this message because you are subscribed to the Google Groups
"Google App Engine" group.
To view this discussion on the web visit
https://groups.google.com/d/msg/google-appengine/-/bYRgRhlKZjoJ.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to
[email protected].
For more options, visit this group at
http://groups.google.com/group/google-appengine?hl=en.

