I forgot to mention: the model that always seems to be part of this
avalanche has many composite indexes: 63.

j

On Dec 2, 2:01 pm, Jason Collins <[email protected]> wrote:
> Hi Per,
>
> We have been seeing similar bursts of DeadlineExceededErrors (DEE) on
> our application (appid: steprep), and we have been seeing this for
> some time (weeks).
>
> The symptom: datastore put() calls that normally take 100-150 ms to
> complete suddenly take 30+ s and cause a DEE. Usually, many processes
> are impacted by this at the same moment, yielding an "avalanche" of
> errors (I like your term!). We are not using Entity Groups and the
> processes that are part of the avalanche are operating over different
> entities.
>
> It seems like some other common resource is blocking a bunch of
> requests, which all end up timing out at the same moment.
>
> j
>
> On Dec 2, 6:14 am, Per Larsson <[email protected]> wrote:
>
>
>
> > I have reported this to the issue tracker here:
> > http://code.google.com/p/googleappengine/issues/detail?id=4180.
> > Cross-posting to this list in case the community can help us out....
>
> > Our app has fairly high traffic, about 15-20 QPS. We are experiencing
> > short bursts of 500 errors that make our app completely unusable for
> > short periods of time. We're not completely sure what's going wrong,
> > but this is our guess:
>
> > Assumption #1: App Engine will never start more instances than your
> > current QPS.
> > Assumption #2: One instance will never handle two requests
> > concurrently (why?).
> > Assumption #3: If your requests take more than one second to execute
> > on average, there will not be enough instances to handle them and
> > they will fail with a 500 and the log message "Request was aborted
> > after waiting too long to attempt to service your request...",
> > sometimes tagged with throttle_code=2, whatever that means.
>
> > Sometimes our application experiences short spikes in latency for
> > datastore writes (we even see occasional
> > DeadlineExceededExceptions in commits and puts). During these spikes
> > we get an avalanche of 500 "Request was aborted..." errors from lots
> > of different URLs. My guess is that a few stalled requests lock up
> > all our available instances. These spikes appear about once an hour
> > and last for five minutes or so.
>
> > Our requests normally run in under 1000 ms, with some headroom to
> > spare. We have tried to work around the problem by optimizing our
> > requests, but we can't really get away from the fact that our app
> > needs to write to the datastore quite often. Attached is a
> > screenshot from the dashboard showing simultaneous spikes in latency
> > and error rates.
>
> > [Attachment: spikes.png, 57K]
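
For what it's worth, the arithmetic behind Per's three assumptions can be sketched in a few lines of Python. This is only an illustration of the guess in the quoted post, not confirmed App Engine scheduler behavior. It uses Little's law (average requests in flight = arrival rate x average latency); the function names and the sample latencies are made up for the example.

```python
def requests_in_flight(qps, avg_latency_s):
    """Little's law: average concurrent requests = arrival rate * latency."""
    return qps * avg_latency_s


def is_saturated(qps, avg_latency_s, max_instances=None):
    """Return True if the in-flight requests exceed the available instances.

    Assumption #1 (from the post, unconfirmed): the instance count is
    capped at the current QPS, so that is the default cap here.
    Assumption #2 (also unconfirmed): each instance serves one request at
    a time, so in-flight requests cannot exceed the instance count.
    """
    if max_instances is None:
        max_instances = qps
    return requests_in_flight(qps, avg_latency_s) > max_instances


if __name__ == "__main__":
    qps = 20  # upper end of the 15-20 QPS mentioned in the post
    # Hypothetical latencies: a normal put, near the limit, and stalled puts.
    for latency in (0.15, 0.9, 1.5, 30.0):
        needed = requests_in_flight(qps, latency)
        print("latency=%.2fs needs %.0f instances, saturated=%s"
              % (latency, needed, is_saturated(qps, latency)))
```

Under these assumptions, anything over one second of average latency exceeds the cap, and a handful of puts stalling for 30 s would demand far more instances than the cap allows, which matches the "avalanche" described above.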

