I did another count across all of our urls. 1 hour, 1.04p - 2.04p log time, 2010-09-20
DeadlineExceededErrors: 157
10-second timeouts: 179

Given the high rate of 10-second timeouts, can I assume that I am over
the 1000ms threshold? These errors represent almost all of the errors
from our system.

j

On Sep 20, 2:59 pm, Jason C <[email protected]> wrote:
> For one of my urls, for a 1 hour period (12.50p to 1.50p log time
> 2010-09-20), I saw 89 DeadlineExceededErrors and 101 of the 10-second
> timeouts. (appid: steprep)
>
> WRT DeadlineExceededError - these requests are normally _well_ below
> the 30s boundary.
>
> j
>
> On Sep 20, 1:24 pm, Kenneth <[email protected]> wrote:
> > We're seeing a lot of 10 second and deadline errors today. Nothing
> > like last week but it is still pretty bad today.
> >
> > There are 21 non-task 10-second errors and 5 30-second deadline
> > errors in the past 8 hours. The deadline errors are on calls that
> > would normally take <500ms.
> >
> > On Sep 20, 5:26 pm, Jason C <[email protected]> wrote:
> > > I _do_ believe I'm seeing otherwise - in the form of lots of
> > > deadline-related errors on large spike jobs (e.g., mapreduce and
> > > other continuation-styled jobs).
> > >
> > > Do you have any suggestions how I could measure this?
> > >
> > > j
> > >
> > > On Sep 20, 9:55 am, "Ikai Lan (Google)" <[email protected]> wrote:
> > > > Task Queues and cron jobs should not. We encourage small tasks,
> > > > but in general tasks that take several seconds to run should not
> > > > impact your autoscaling. If you're seeing otherwise, please let
> > > > us know.
> > > >
> > > > --
> > > > Ikai Lan
> > > > Developer Programs Engineer, Google App Engine
> > > > Blogger: http://googleappengine.blogspot.com
> > > > Reddit: http://www.reddit.com/r/appengine
> > > > Twitter: http://twitter.com/app_engine
> > > >
> > > > On Mon, Sep 20, 2010 at 11:33 AM, Jason C <[email protected]> wrote:
> > > > > Ikai,
> > > > >
> > > > > Do you have a definitive answer on whether or not task/cron
> > > > > requests count towards the 1000ms threshold?
> > > > > There seems to be some confusion and counter-evidence here.
> > > > >
> > > > > Including our cron/task requests, we run at 1500-2000ms / request.
> > > > > This is largely because we have LOTS of taskqueue items and we
> > > > > tend to do a fair amount of work in them. Further, when we do
> > > > > large spike jobs (e.g., mapreduce), we see lots of
> > > > > deadline-related errors.
> > > > >
> > > > > What is the best way to know if we're above or below this
> > > > > threshold? (appid: steprep)
> > > > >
> > > > > j
> > > > >
> > > > > On Sep 16, 7:41 pm, "Jan Z/ Hapara" <[email protected]> wrote:
> > > > > > Hi Ikai - the behavior we are seeing suggests the "offline"
> > > > > > tasks are subject to the same 1000msec rule as external
> > > > > > requests.
> > > > > >
> > > > > > Queuing up a number of tasks reliably results in the "Request
> > > > > > was aborted after waiting too long to attempt to service your
> > > > > > request" error - which is actually fine, BUT, the appengine
> > > > > > kicks in the back-off algorithm.
> > > > > >
> > > > > > This results in tasks that cycle for 20+ generations, with mean
> > > > > > time between run attempts of 19hr+.
> > > > > >
> > > > > > How do we know the 1000 msec rule is in effect?
> > > > > >
> > > > > > The situation improves drastically if we introduce a large
> > > > > > number of "no-op" tasks that complete in ~40 msec and skew the
> > > > > > averages.
> > > > > >
> > > > > > J
> > > > > >
> > > > > > On Sep 17, 2:05 am, "Ikai Lan (Google)" <[email protected]> wrote:
> > > > > > > Jason, I think your situation is fine. Offline tasks have the
> > > > > > > property that, unlike user-facing tasks, they do not require
> > > > > > > instant execution. If you schedule an offline task for "now",
> > > > > > > that actually means "when there's capacity" and App Engine
> > > > > > > can allocate idle capacity to process your request.
> > > > > > > Thus, the need to spin up additional instances is unnecessary
> > > > > > > in most cases. Are you seeing that your tasks are backed up?
> > > > > > >
> > > > > > > On Thu, Sep 16, 2010 at 12:56 PM, bFlood <[email protected]> wrote:
> > > > > > > > "which in turn affects the capacity available for running
> > > > > > > > offline tasks" - so, if you have a low volume site, you
> > > > > > > > won't get that many instances for your tasks? likewise, if
> > > > > > > > you have some user facing requests that go longer than
> > > > > > > > 1000ms (by design or otherwise), the instances available
> > > > > > > > for your tasks are impacted? or am I confused?
> > > > > > > >
> > > > > > > > On Sep 16, 8:44 am, "Nick Johnson (Google)" <[email protected]> wrote:
> > > > > > > > > Hi Jason,
> > > > > > > > >
> > > > > > > > > The same appservers are used to serve user-facing and
> > > > > > > > > offline traffic. The volume of user-facing traffic (that
> > > > > > > > > is below the latency threshold) you serve determines how
> > > > > > > > > many appservers we provision for your application, which
> > > > > > > > > in turn affects the capacity available for running
> > > > > > > > > offline (task queue and cron) tasks.
> > > > > > > > >
> > > > > > > > > -Nick Johnson
> > > > > > > > >
> > > > > > > > > On Thu, Sep 16, 2010 at 1:41 PM, Jason C <[email protected]> wrote:
> > > > > > > > > > The number of instances that App Engine makes available
> > > > > > > > > > to your application depends on if you keep your average
> > > > > > > > > > request time under 1000ms for user-facing requests.
> > > > > > > > > >
> > > > > > > > > > Ikai Lan (I believe) said that taskqueue and cron job
> > > > > > > > > > requests do not count against this boundary. Ikai also
> > > > > > > > > > said that this boundary was in place because longer
> > > > > > > > > > requests were bad for the ecosystem.
> > > > > > > > > >
> > > > > > > > > > Since taskqueue and cron job requests do not count
> > > > > > > > > > against this boundary, in order for them to not be bad
> > > > > > > > > > for the ecosystem, I'm guessing that they are served
> > > > > > > > > > from a different set of servers than user-facing
> > > > > > > > > > requests are.
> > > > > > > > > >
> > > > > > > > > > We (appid: steprep) have a number of external machines
> > > > > > > > > > that also hit our urls. While we make every effort to
> > > > > > > > > > keep user-facing requests quick and responsive, we
> > > > > > > > > > often use many seconds serving the requests that are
> > > > > > > > > > built for external machines (by design).
> > > > > > > > > >
> > > > > > > > > > It has only just struck me this morning that this could
> > > > > > > > > > be having a bad (perhaps dramatic) impact on our
> > > > > > > > > > overall scalability.
> > > > > > > > > >
> > > > > > > > > > First off, is it true that cron and taskqueue items are
> > > > > > > > > > served on a different set of servers? If so, is there
> > > > > > > > > > any way to designate that a particular url is being
> > > > > > > > > > requested by a machine and can be routed to this
> > > > > > > > > > alternate set (of presumably slower) servers (e.g., a
> > > > > > > > > > request header)?
> > > > > > > > > >
> > > > > > > > > > If I'm way off on all of this, and if taskqueue and
> > > > > > > > > > cron jobs are served from the same set of servers, I'm
> > > > > > > > > > not sure how the "bad for the ecosystem" argument
> > > > > > > > > > holds, and perhaps Google should revisit this 1000ms
> > > > > > > > > > boundary condition altogether.
> > > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > > You received this message because you are subscribed to
> > > > > > > > > > the Google Groups "Google App Engine" group.
> > > > > > > > > > To post to this group, send email to [email protected].
> > > > > > > > > > To unsubscribe from this group, send email to [email protected].
> > > > > > > > > > For more options, visit this group at
> > > > > > > > > > http://groups.google.com/group/google-appengine?hl=en.
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > Nick Johnson, Developer Programs Engineer, App Engine
> > > > > > > > > Google Ireland Ltd. :: Registered in Dublin, Ireland,
> > > > > > > > > Registration Number: 368047
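One way to reason about the "no-op tasks skew the averages" observation in the thread is to treat the 1000ms rule as a plain arithmetic mean over all requests. That is an assumption: nothing in the thread confirms how App Engine actually computes it. Under that assumption, the Python sketch below estimates how many ~40 ms no-op requests it would take to pull a workload's mean under the threshold. The function names `mean_latency_ms` and `noop_tasks_needed` are hypothetical helpers, not App Engine APIs, and the sample data is made up for illustration.

```python
import math


def mean_latency_ms(latencies_ms):
    """Arithmetic mean of request latencies in milliseconds."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    return sum(latencies_ms) / len(latencies_ms)


def noop_tasks_needed(latencies_ms, noop_ms=40.0, threshold_ms=1000.0):
    """Smallest number of extra no-op requests (each taking noop_ms) that
    brings the mean latency strictly below threshold_ms.

    Assumes the threshold applies to a simple mean over all requests,
    which the thread does not confirm.
    """
    if noop_ms >= threshold_ms:
        raise ValueError("no-op latency must be below the threshold")
    n = len(latencies_ms)
    total = sum(latencies_ms)
    if mean_latency_ms(latencies_ms) < threshold_ms:
        return 0
    # Solve (total + k * noop_ms) / (n + k) < threshold_ms for k:
    #   k > (total - threshold_ms * n) / (threshold_ms - noop_ms)
    k = (total - threshold_ms * n) / (threshold_ms - noop_ms)
    return math.floor(k) + 1


if __name__ == "__main__":
    # Hypothetical sample: 100 requests at 1500 ms each, the low end of
    # the "1500-2000ms / request" figure mentioned in the thread.
    sample = [1500.0] * 100
    print(mean_latency_ms(sample))
    print(noop_tasks_needed(sample))
```

With this made-up sample, 53 no-op requests of 40 ms are needed before the mean drops below 1000 ms; with 52 the mean is still just above it.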
