Re: [google-appengine] Re: (mostly) Consistent 20-second delay in starting backend tasks

Robert Kluin Sun, 05 Feb 2012 21:17:41 -0800

That's interesting.  Did the queue sit there for a long time not
running anything, or running tasks very slowly?  Are the tasks in that
queue generally long-running?


I _very_ infrequently bump into that type of issue, but I periodically
will see one queue slow down for a while.  It *seems* to happen far
more often in queues with slower tasks, but I don't have any recent
empirical evidence of that.  And I *think* I've been told that should
not be the case.


Robert



On Sun, Feb 5, 2012 at 19:27, Carter Maslan <[email protected]> wrote:
> Nicholas -
>
> For our examples of the 10-20 minute delay:
> app_id=s~camiologger
> queue=image-label
> (but several other queues experience the same long delays sometimes:
> content-process, counter-update, etc...)
>
> The tasks were not added with transactions; just this code:
> Queue queueP =
>     QueueFactory.getQueue(ServerUtils.QUEUE_NAME_IMAGE_LABEL_PUSH);
> TaskHandle th = queueP.add(withUrl(ServerUtils.PATH_ADMIN_MOTION_LABEL)
>     .param("key", contentKeyString)
>     .method(TaskOptions.Method.GET));
>
>
> Let me know if you need more info.  We noticed this in the last few weeks.
> Carter
>
>
>
> On Sun, Feb 5, 2012 at 4:05 PM, Dave Loomer <[email protected]> wrote:
>>
>> As the OP you may be interested in my app ID as well: mn-live.  I
>> provided some logs a few posts back and some exhaustive details at the
>> beginning.
>>
>> However, you won't see this issue popping up anymore on my app since I
>> "solved" it by setting countdown=1 a week ago. Since then, tasks start
>> very reliably after a 1.5 second delay.  If I remove the countdown
>> parameter, then it returns to 20 seconds (+/- .01) pretty reliably.
>>
>> On Feb 5, 5:59 pm, Nicholas Verne <[email protected]> wrote:
>> > We would have no need to shoot anyone.
>> >
>> > However, the explanations quickly become obsolete. They enter the
>> > folklore in the form that was current at the time and become
>> > entrenched as incorrect information when the implementations have
>> > changed.
>> >
>> > Task Queues use best effort scheduling. They're not real time all the
>> > time, although when our best efforts are running smoothly they can
>> > appear real time. For scheduling, the task eta marks the earliest time
>> > at which the task can run. We can't guarantee that a task WILL run at
>> > that time.
>> >
>> > Steve, we're interested to know about the 10-20 minute delays you've
>> > seen. Can you tell us the app id, queue, and whether the tasks were
>> > added transactionally? An example from your logs would be very
>> > helpful.
>> >
>> > Nick Verne
>> >
>> > On Mon, Feb 6, 2012 at 9:27 AM, stevep <[email protected]> wrote:
>> > > Carter wrote: We regularly but erratically see 10-20 minute delays in
>> > > running push queue tasks.
>> >
>> > > Been a burr under the saddle forever. What I really don't understand
>> > > -- assuming GAE engineers never see the benefit of providing at least
>> > > one priority/reliability queue -- is why the heck there is never any
>> > > explanation about how tasks get scheduled, and why these weird delays
>> > > happen. It is either: 1) If we told you we would have to shoot you, or
>> > > 2) We can't see the benefit of you understanding this.
>> >
>> > > -stevep
>> >
>> > > On Feb 5, 9:24 am, Carter <[email protected]> wrote:
>> > >> We regularly but erratically see 10-20 minute delays in running push
>> > >> queue tasks.
>> > >> The tasks sit in the queue with ETA as high as 20 minutes *ago*
>> > >> without any errors or retries.
>> >
>> > >> (the problem seems unrelated to queue settings since our Maximum
>> > >> Rate, Enforced Rate and Maximum Concurrent all far exceed the queue's
>> > >> throughput at the time of the delays)
>> >
>> > >> Any tips or clues on how to prevent this while still using push
>> > >> queues
>> > >> without backends?
>> >
>> > >> On Feb 1, 9:03 pm, Robert Kluin <[email protected]> wrote:
>> >
>> > >> > Hey Dave,
>> > >> >   Hopefully Nick will be able to offer some insight into the cause
>> > >> > of your issues.  I'd guess it is something related to having very few
>> > >> > tasks (one) in the queue, and it not getting scheduled rapidly.
>> >
>> > >> >   In your case, you could use pull queues to immediately fetch the
>> > >> > next task when finished with a task.  Or even to fetch multiple tasks
>> > >> > and do the work in parallel.  Basically you'd have a backend that ran
>> > >> > a loop (possibly initiated via a push task) that would lease a task,
>> > >> > or tasks, from the pull queue, do the work, delete those tasks, then
>> > >> > repeat from the lease stage.  The cool thing is that if you're, for
>> > >> > example, using URL Fetch to pull data, this might let you do it in
>> > >> > parallel without increasing your costs much (if any).
>> >
>> > >> > Robert
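
The lease / work / delete loop Robert describes can be sketched roughly
as below. This only illustrates the control flow: InMemoryPullQueue is a
hypothetical in-memory stand-in, not the App Engine task queue API, the
"work" is a placeholder, and lease expiry is not modeled.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class LeaseLoopSketch {
    /** Hypothetical in-memory stand-in for a pull queue; NOT the App Engine API. */
    static final class InMemoryPullQueue {
        private final Map<String, String> pending = new LinkedHashMap<>();
        private final Map<String, String> leased = new LinkedHashMap<>();

        void add(String name, String payload) {
            pending.put(name, payload);
        }

        /** Moves up to maxTasks pending tasks into the leased set and returns their names. */
        List<String> lease(int maxTasks) {
            List<String> names = new ArrayList<>();
            for (String name : pending.keySet()) {
                if (names.size() == maxTasks) {
                    break;
                }
                names.add(name);
            }
            for (String name : names) {
                leased.put(name, pending.remove(name));
            }
            return names;
        }

        String payload(String name) {
            return leased.get(name);
        }

        /** Deleting a leased task marks it done so it is not handed out again. */
        void delete(String name) {
            leased.remove(name);
        }

        boolean isEmpty() {
            return pending.isEmpty() && leased.isEmpty();
        }
    }

    /** Lease a batch, do the work, delete the batch, repeat until nothing can be leased. */
    static List<String> drain(InMemoryPullQueue queue, int batchSize) {
        List<String> results = new ArrayList<>();
        while (true) {
            List<String> batch = queue.lease(batchSize);
            if (batch.isEmpty()) {
                break; // nothing left to lease
            }
            for (String name : batch) {
                results.add(queue.payload(name).toUpperCase()); // placeholder for real work
                queue.delete(name);
            }
        }
        return results;
    }

    public static void main(String[] args) {
        InMemoryPullQueue queue = new InMemoryPullQueue();
        queue.add("t1", "a");
        queue.add("t2", "b");
        queue.add("t3", "c");
        System.out.println(drain(queue, 2));
    }
}
```

Because the loop leases the next batch as soon as the previous one is
deleted, the gap between tasks is set by the worker rather than by the
push scheduler. In a real backend the lease and delete steps would be
calls to the task queue service, and a task whose lease expires before
it is deleted would become leasable again.
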
>> >
>> > >> > On Wed, Feb 1, 2012 at 14:25, Dave Loomer <[email protected]>
>> > >> > wrote:
>> > >> > > Here are logs from three consecutive task executions over the past
>> > >> > > weekend, with only identifying information removed. You'll see that
>> > >> > > each task completes in a few milliseconds, but they are 20 seconds
>> > >> > > apart (remember: I've already checked my queue configurations,
>> > >> > > nothing else is running on this backend, and I later solved the
>> > >> > > problem by setting countdown=1 when adding the task).  I don't see
>> > >> > > any pending latency mentioned.
>> >
>> > >> > > 0.1.0.2 - - [27/Jan/2012:18:33:20 -0800] 200 124 ms=10 cpu_ms=47
>> > >> > > api_cpu_ms=0 cpm_usd=0.000060 queue_name=overnight-tasks
>> > >> > > task_name=15804554889304913211 instance=0
>> > >> > > 0.1.0.2 - - [27/Jan/2012:18:33:00 -0800] 200 124 ms=11 cpu_ms=0
>> > >> > > api_cpu_ms=0 cpm_usd=0.000060 queue_name=overnight-tasks
>> > >> > > task_name=15804554889304912461 instance=0
>> > >> > > 0.1.0.2 - - [27/Jan/2012:18:32:41 -0800] 200 124 ms=26 cpu_ms=0
>> > >> > > api_cpu_ms=0 cpm_usd=0.000060 queue_name=overnight-tasks
>> > >> > > task_name=4499136807998063691 instance=0
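
The spacing between the three log entries above can be checked from
their timestamps alone. A minimal sketch in plain Java (no App Engine
SDK needed); the class and method names are made up for illustration,
and the timestamps are copied from the log lines, which only have
one-second resolution:

```java
import java.time.Duration;
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class LogGaps {
    // Timestamp layout used in the request log lines, e.g. 27/Jan/2012:18:33:20 -0800
    private static final DateTimeFormatter LOG_TIME =
            DateTimeFormatter.ofPattern("dd/MMM/uuuu:HH:mm:ss Z", Locale.ENGLISH);

    /** Returns the gap in whole seconds between each pair of adjacent timestamps (oldest first). */
    static List<Long> gapsInSeconds(List<String> timestampsOldestFirst) {
        List<Long> gaps = new ArrayList<>();
        for (int i = 1; i < timestampsOldestFirst.size(); i++) {
            OffsetDateTime prev = OffsetDateTime.parse(timestampsOldestFirst.get(i - 1), LOG_TIME);
            OffsetDateTime next = OffsetDateTime.parse(timestampsOldestFirst.get(i), LOG_TIME);
            gaps.add(Duration.between(prev, next).getSeconds());
        }
        return gaps;
    }

    public static void main(String[] args) {
        // The logs list the newest request first; reorder to oldest first.
        List<String> stamps = List.of(
                "27/Jan/2012:18:32:41 -0800",
                "27/Jan/2012:18:33:00 -0800",
                "27/Jan/2012:18:33:20 -0800");
        System.out.println(gapsInSeconds(stamps)); // prints [19, 20]
    }
}
```

So consecutive runs are 19 and 20 seconds apart at the logs' resolution,
while each request itself reports ms=10 to ms=26.
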
>> >
>> > >> > > The 20 seconds seems to happen regardless of length of task. Even
>> > >> > > though my tasks mostly complete in a couple minutes, I do have cases
>> > >> > > where they take several minutes, and I don't see a difference. Of
>> > >> > > course, when a task takes 5-10 minutes to complete, I'm going to
>> > >> > > notice and care about a 20-second delay much less than when I'm
>> > >> > > trying to spin through a few tasks in a minute (which is a
>> > >> > > real-world need for me as well).
>> >
>> > >> > > When reading up on pull queues a while back, I was a little
>> > >> > > confused about
>> > >> > > where I would use them with my own backends. I definitely could
>> > >> > > see an
>> > >> > > application for offloading work to an AWS Linux instance. But in
>> > >> > > either
>> > >> > > case, could you explain why it might help?
>> >
>> > >> > > I saw you mention in a separate thread how M/S can perform
>> > >> > > differently from
>> > >> > > HRD even in cases where one wouldn't expect to see a difference.
>> > >> > > When I get
>> > >> > > around to it I'm going to create a tiny HRD app and run the same
>> > >> > > tests
>> > >> > > through that.
>> >
>> > >> > > I also wonder if M/S could be responsible for frequent latencies
>> > >> > > in my admin
>> > >> > > console. Those have gotten more frequent and annoying the past
>> > >> > > couple of
>> > >> > > months ...
>> >
>> > >> > > --
>> > >> > > You received this message because you are subscribed to the
>> > >> > > Google Groups
>> > >> > > "Google App Engine" group.
>> > >> > > To view this discussion on the web visit
>> > >> > > https://groups.google.com/d/msg/google-appengine/-/lbNQRQdSx0AJ.
>> >
>> > >> > > To post to this group, send email to
>> > >> > > [email protected].
>> > >> > > To unsubscribe from this group, send email to
>> > >> > > [email protected].
>> > >> > > For more options, visit this group at
>> > >> > > http://groups.google.com/group/google-appengine?hl=en.
>> >
>>
>>
>

