Thanks!
The migration has fixed the problem; error rates seem to be back to
normal.

The problem started occurring 2 days ago; I would say the error rates
had a strong correlation with the latency chart at
http://code.google.com/status/appengine/detail/serving/2010/02/23#ae-trust-detail-helloworld-get-latency

The symptoms I was having were the same as in
http://code.google.com/p/googleappengine/issues/detail?id=1695 (see
comments 10, 13 and 14, from September 16).

On both occasions I see the same pattern: the DeadlineException kicks
in correctly at about 30 seconds, but even though my app is not
sleeping, the CPU time used is always very low (on the order of two
seconds; on September 16 I was seeing only 1 second).

From this observation, a brute-force approach to detecting the problem
could be to have a watchdog process run a busy loop on each shard for
5 seconds (for example); if the watchdog is killed by the 30-second
deadline, you probably have a starvation problem.
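
A minimal sketch of that watchdog idea, in Python and under stated
assumptions. The names busy_loop and looks_starved and the 0.5
threshold are hypothetical, and time.process_time() is a modern
Python 3 call, not something the runtime offered at the time. Instead
of waiting to be killed by the deadline, it compares the CPU time it
burned with the wall-clock time that elapsed:

```python
import time


def busy_loop(cpu_seconds):
    """Burn roughly cpu_seconds of CPU time.

    Returns (cpu_used, wall_elapsed), both in seconds.
    """
    cpu_start = time.process_time()
    wall_start = time.monotonic()
    counter = 0
    while time.process_time() - cpu_start < cpu_seconds:
        counter += 1  # pure computation: no sleeping, no RPC
    cpu_used = time.process_time() - cpu_start
    wall_elapsed = time.monotonic() - wall_start
    return cpu_used, wall_elapsed


def looks_starved(cpu_used, wall_elapsed, min_cpu_share=0.5):
    """True if the process got less than min_cpu_share of the wall time.

    min_cpu_share is an illustrative threshold, not a measured value.
    """
    if wall_elapsed <= 0:
        return False
    return cpu_used / wall_elapsed < min_cpu_share
```

With the numbers from my logs, looks_starved(2.0, 30.0) is True: about
2 seconds of CPU in a 30-second window is under 7% of the wall time. A
healthy shard running busy_loop(5) should finish in not much more than
5 seconds of wall time.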

RPC does not seem to be the problem here, as I think my app timed out
before it had a chance to make any RPC calls.

Also, for debugging purposes, maybe the current shard information
could be exposed somewhere in the admin interface.

Regards, and thanks again,
Pedro Morais


On Feb 23, 10:59 pm, "Ikai L (Google)" <[email protected]> wrote:
> Pedro,
>
> I've migrated your application. Can you check and report back what is
> happening? If your application's error rate is now closer to
> normal, then we have an issue with a hotspot. Do you know if the
> errors increased more at certain times of the day, or was the number
> of errors elevated at all times?
>
> On Tue, Feb 23, 2010 at 2:47 PM, John Gardner <[email protected]> wrote:
> > I'm seeing the same thing on one of my apps.  The same codebase with
> > another appid appears to be fine.  Is an appid tied to certain
> > instances, and some groups of instances are dead/dying?
>
> > On Feb 23, 1:46 pm, Pedro Morais <[email protected]> wrote:
> > > Can anyone from Google take a look at the issue please?
>
> > > My application is basically unusable right now.
>
> > > Regards,
> > > Pedro Morais
>
> > > On Feb 23, 3:32 pm, Pedro Morais <[email protected]> wrote:
>
> > > > Hi,
>
> > > > Looking at
> > > > http://code.google.com/status/appengine/detail/serving/2010/02/22#ae-...
> > > > you can see that there was a spike yesterday in request serving
> > > > latency.
>
> > > > The comment says "Investigation Complete - Issue Resolved
> > > > We have determined that this spike did not affect the performance or
> > > > uptime of applications. If you feel we have incorrectly diagnosed this
> > > > issue please inform us by posting in our developer forum."
>
> > > > Well, it did affect the performance and uptime of at least one of my
> > > > (Python) applications, as reported in the following production issue
> > > > (still in the new state):
> > > > http://code.google.com/p/googleappengine/issues/detail?id=2837
>
> > > > Today, we are seeing the same spike (right now the chart is in a spike
> > > > before 8AM and at about 650):
> > > > http://code.google.com/status/appengine/detail/serving/2010/02/23#ae-...
>
> > > > And again, my app is having a lot of Deadline exceptions.
>
> > > > Looking at the logs, the thing that I think is most relevant is that
> > > > my app only gets to spend about 2000 CPU ms before the 30000 ms period
> > > > finishes, so it's severely starved for resources; so starved that it
> > > > doesn't even manage to load Django 1.1 before the Deadline kicks in.
>
> > > > Regards,
> > > > Pedro Morais
>
> > --
> > You received this message because you are subscribed to the Google Groups
> > "Google App Engine" group.
> > To post to this group, send email to [email protected].
> > To unsubscribe from this group, send email to
> > [email protected].
> > For more options, visit this group at
> > http://groups.google.com/group/google-appengine?hl=en.
>
> --
> Ikai Lan
> Developer Programs Engineer, Google App Engine
> http://googleappengine.blogspot.com | http://twitter.com/app_engine
