Hey Damith, I believe I can shed some light on this situation. As explained in the docs:
"If the DeadlineExceededError *is* caught but a response is not produced quickly enough (you have less than a second), the request is aborted and a 500 internal server error is returned."

Another possible cause of 104 errors is described here:

"In the Java runtime, if the DeadlineExceededException is not caught, an uncatchable HardDeadlineExceededError is thrown. The instance is terminated in both cases, but the HardDeadlineExceededError does not give any time margin to return a custom response. To make sure your request returns within the allowed time frame, you can use the ApiProxy.getCurrentEnvironment().getRemainingMillis() <https://cloud.google.com/appengine/docs/java/javadoc/com/google/apphosting/api/ApiProxy.Environment#getRemainingMillis--> method to checkpoint your code and return if you have no time left. The Java runtime page <https://cloud.google.com/appengine/docs/java/#The_Request_Timer> contains an explanation of how to use this method. If concurrent requests are enabled through the "threadsafe" flag, every other running concurrent request is killed with error code 104."

In line with the first quoted passage above, it's possible that Datastore latency (the kind that would push the Datastore calls long enough for the request itself to face a deadline error) is also causing the AppStats recording of the DeadlineExceeded exception to run so long that the response is not produced "quickly enough", leading to the observed error. In general, code written to handle a DeadlineExceeded error, while a good thing, tends to provide just enough buffer room that the system tolerates running very close to the deadline, often enough that a slight increase in Datastore latency can cause a certain proportion of requests to fail.
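To make the checkpointing pattern from the docs concrete, here is a minimal sketch. Since the App Engine SDK isn't available outside the sandbox, the `RemainingTime` interface below is a hypothetical stand-in for `ApiProxy.getCurrentEnvironment().getRemainingMillis()`; in a real handler you would call that method directly. The item type, the 1-second margin, and the "work" done per item are illustrative assumptions, not anything from your app:

```java
import java.util.ArrayList;
import java.util.List;

public class DeadlineCheckpoint {
    // Reserve at least this much time (ms) to build and flush a response;
    // per the docs, finishing with under ~1s left risks a 500 / error 104.
    static final long SAFETY_MARGIN_MS = 1000;

    // Hypothetical stand-in for ApiProxy.getCurrentEnvironment().getRemainingMillis().
    interface RemainingTime {
        long remainingMillis();
    }

    // Processes items until done, or bails out early once the remaining
    // request budget drops below the safety margin, returning partial results
    // so the handler still has time to respond.
    static List<String> processWithCheckpoints(List<String> items, RemainingTime budget) {
        List<String> done = new ArrayList<>();
        for (String item : items) {
            if (budget.remainingMillis() < SAFETY_MARGIN_MS) {
                break; // checkpoint: stop working and return what we have
            }
            done.add(item.toUpperCase()); // placeholder for the real per-item work
        }
        return done;
    }
}
```

The handler can then return a partial (or retry-later) response instead of letting the hard deadline kill the instance.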
With AppStats being used to capture exceptions, and the deadline limit itself being announced via an exception (thus setting the AppStats machinery to work, possibly for long enough to cause an outright crash), this can lead to some moderately complex failure scenarios. I believe this entire class of errors could be avoided by taking a look at whatever long-running activity is causing the requests to run so close to the deadline, and shifting that activity to a Task Queue or another form of processing which doesn't take place directly within the App Engine request handler. Another option would be to switch to Basic scaling, which does not have a 60-second deadline <https://cloud.google.com/appengine/docs/java/how-requests-are-handled#Java_The_request_timer>.

Cheers,

Nick
Cloud Platform Community Support

On Thursday, August 25, 2016 at 8:17:10 PM UTC-4, Damith C Rajapakse wrote:
>
> Thanks again for the help Nick. Yes, we are looking into reducing the
> stack traces. Again, this warning should not happen suddenly after so many
> years in operation.
>
> On a related note, this and a few other errors (e.g. 104 errors) that
> came in a wave at the same time (possibly after the recent GAE update)
> feels like some dials on the GAE side were tweaked that affected various
> constraints applied to apps. For example, errors we were previously able
> to handle by catching the DeadlineExceededError suddenly started giving 104
> errors. As I've not seen too many reports of similar problems from other
> app devs, this situation is probably affecting only those apps that handle
> high-latency datastore-heavy requests (like our app).
