Thanks Nick. Yes, that's our thinking too. At the moment we've removed all 'error handling' code that used to run after the DeadlineExceededException and also removed AppStats. Seems to work. I wish App Engine gave a slightly longer window to clean up, e.g. 5 seconds. Thanks again for taking the time to give a helpful reply. Much appreciated.
On Fri, Sep 9, 2016 at 4:29 AM, 'Nick (Cloud Platform Support)' via Google App Engine <[email protected]> wrote:
> Hey Damith,
>
> I believe I can shed some light on this situation. As explained in the docs,
>
> If the DeadlineExceededError *is* caught but a response is not produced quickly enough (you have less than a second), the request is aborted and a 500 internal server error is returned.
>
> Another possible cause of 104 errors is:
>
> In the Java runtime, if the DeadlineExceededException is not caught, an uncatchable HardDeadlineExceededError is thrown. The instance is terminated in both cases, but the HardDeadlineExceededError does not give any time margin to return a custom response. To make sure your request returns within the allowed time frame, you can use the ApiProxy.getCurrentEnvironment().getRemainingMillis() <https://cloud.google.com/appengine/docs/java/javadoc/com/google/apphosting/api/ApiProxy.Environment#getRemainingMillis--> method to checkpoint your code and return if you have no time left. The Java runtime page <https://cloud.google.com/appengine/docs/java/#The_Request_Timer> contains an explanation of how to use this method. If concurrent requests are enabled through the "threadsafe" flag, every other running concurrent request is killed with error code 104.
>
> In line with the first quoted documentation above, it's possible that the Datastore latency (the kind which would have caused the Datastore calls to go so long that the request itself would be facing a deadline error) could be causing the AppStats writing of the DeadlineExceeded exception itself to go so long that the response is not produced "quickly enough", leading to the observed error.
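[Editor's note: the checkpointing pattern described above can be sketched roughly as follows. This is a hypothetical illustration, not code from the thread: in a real App Engine handler the remaining-time value would come from ApiProxy.getCurrentEnvironment().getRemainingMillis(); here it is injected as a LongSupplier so the sketch is self-contained, and the work loop and 2-second safety margin are made-up stand-ins.]

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongSupplier;

// Hypothetical sketch of the checkpointing pattern described above.
// On App Engine the remaining-time source would be
// ApiProxy.getCurrentEnvironment()::getRemainingMillis; here it is
// injected so the example runs anywhere.
public class CheckpointedWork {
    // Stop starting new work once less than this much time remains,
    // leaving room to serialize and return a (partial) response.
    static final long SAFETY_MARGIN_MS = 2000;

    /** Processes items until done or until the deadline is near. */
    static List<String> process(List<String> items, LongSupplier remainingMillis) {
        List<String> done = new ArrayList<>();
        for (String item : items) {
            if (remainingMillis.getAsLong() < SAFETY_MARGIN_MS) {
                // Checkpoint: bail out early and return what we have,
                // rather than risk a HardDeadlineExceededError.
                break;
            }
            done.add(item.toUpperCase()); // stand-in for the real work
        }
        return done;
    }

    public static void main(String[] args) {
        List<String> items = List.of("a", "b", "c");
        // Plenty of time left: everything gets processed.
        System.out.println(process(items, () -> 60000)); // [A, B, C]
        // Almost out of time: we stop before doing any work.
        System.out.println(process(items, () -> 500));   // []
    }
}
```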
>
> In general, it appears that the code written to handle a DeadlineExceeded error, while a good thing, can tend to provide buffer room for tolerating the system being very close to the deadline often enough that a slight change in Datastore latency could cause a certain proportion of requests to fail. Since AppStats is used to capture exceptions, and the deadline limit itself is announced via an exception (thus causing the AppStats machinery to start working, possibly for too long a duration, leading to an outright crash), this can lead to some moderately complex failure scenarios.
>
> I believe this entire class of errors would be avoided by taking a look at whatever long-running activity is causing the requests to run so close to the deadline, and shifting that activity to a Task Queue or other form of processing which doesn't take place directly within the App Engine request handler. Another option would be to switch to Basic scaling, which does not have a 60-second deadline <https://cloud.google.com/appengine/docs/java/how-requests-are-handled#Java_The_request_timer>.
>
> Cheers,
>
> Nick
> Cloud Platform Community Support
>
> On Thursday, August 25, 2016 at 8:17:10 PM UTC-4, Damith C Rajapakse wrote:
>>
>> Thanks again for the help Nick. Yes, we are looking into reducing the stack traces. Again, this warning should not happen suddenly after so many years in operation.
>>
>> On a related note, this and a few other errors (e.g. 104 errors) that came in a wave at the same time (possibly after the recent GAE update) feel like some dials on the GAE side were tweaked, affecting various constraints applied to apps. For example, errors we were previously able to handle by catching the DeadlineExceededError suddenly started giving 104 errors.
>> As I've not seen too many reports of similar problems from other app devs, this situation is probably affecting only those apps that handle high-latency, Datastore-heavy requests (like our app).
>
> --
> You received this message because you are subscribed to a topic in the Google Groups "Google App Engine" group.
> To unsubscribe from this topic, visit https://groups.google.com/d/topic/google-appengine/xXAJQLRLL-I/unsubscribe.
> To unsubscribe from this group and all its topics, send an email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at https://groups.google.com/group/google-appengine.
> To view this discussion on the web visit https://groups.google.com/d/msgid/google-appengine/7a726dc0-2567-49d4-a3ca-71f490eccc78%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
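[Editor's note: the Basic scaling option Nick suggests is configured in appengine-web.xml. The fragment below is a hedged sketch only; the module name, instance class, and limits are made-up illustrative values.]

```xml
<!-- Hypothetical appengine-web.xml fragment enabling Basic scaling.
     Module name, instance class, and limits are illustrative values. -->
<appengine-web-app xmlns="http://appengine.google.com/ns/1.0">
  <module>worker</module>
  <instance-class>B2</instance-class>
  <basic-scaling>
    <max-instances>5</max-instances>
    <idle-timeout>10m</idle-timeout>
  </basic-scaling>
</appengine-web-app>
```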
