Hi Jon

(I hit send before finishing - sorry for the dupe; I needed to restructure the
email)

I appreciate the look and the feedback. Other than the index, the timeouts
really only started occurring recently, so I am presuming they were caused by
long URL fetches (which also only seem to have started recently - the
Twitter API hasn't been slow from another service I run).

For sure, the datastore timeouts may be at the end of the script, but now I
am only really seeing errors that escape the retry logic - which explains
some of the DeadlineExceeded errors. However, I have recently seen far more
errors than I normally do - hence the email.  The inserts are at the end of
the script because I was trying to be performant by using a batch put; the
problem is I need the db.put to happen as soon as the urlfetch completes (I
need to account for it), so I might have to move it back out of the batch
put.
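To illustrate the restructuring I mean - a minimal sketch with in-memory
stand-ins for urlfetch and db.put, not my real code:

```python
saved = []                     # stand-in for the datastore

def fetch(url):                # stand-in for urlfetch.fetch; may take seconds
    return "result for " + url

def db_put(entity):            # stand-in for db.put
    saved.append(entity)

def process(urls):
    for url in urls:
        result = fetch(url)    # the slow call
        db_put(result)         # persist immediately, so a deadline near the
                               # 30-second limit loses at most one item

process(["a", "b"])
```

The trade-off is more datastore round trips, but no pile-up of puts after the
fetches have already eaten most of the request deadline.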

I think I am going to have to have an intermediate work queue for the entity
in question.  Is there any news on the "Google Queue" that I am pretty sure I
saw in a roadmap somewhere? If you need beta testers, you have one here who
has a lot of volume to put through it - or if you have any information, I
would love to see if it fits my needs.  The issue I will have here, and it is
one that I need to solve, is that if I queue each action that my app needs to
take (1000s a minute), I am not overly keen on having to execute web requests
to drain it - obviously, if I have to dequeue 1000s of items a minute, a cron
schedule will not help because it will only do one dequeue a minute (and I
can't batch dequeue because I will then be back to the same problem of
performing multiple web requests and db.puts in one handler call).
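For what it's worth, the batch-dequeue shape I have in mind looks something
like this (an in-memory sketch; the cap on batch size is there so one handler
call stays under the request deadline):

```python
from collections import deque

work = deque(range(1000))      # hypothetical items queued in one minute

def dequeue_batch(q, max_items):
    """Pull up to max_items in one call, so a single handler invocation can
    drain many items instead of one per cron tick."""
    batch = []
    while q and len(batch) < max_items:
        batch.append(q.popleft())
    return batch

first = dequeue_batch(work, 50)   # one web request drains 50 items
```

Even then, at 1000s of items a minute I'd need many such calls per minute,
which is exactly the web-request volume I'm trying to avoid.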

So the application at the moment (and it is no secret how it works) is as
follows:

   1. Mac Mini asks what work there is to do on 1 of 10 "pseudo" threads - I
   partition the data in App Engine so I can thread the processing on my
   Mac.
      1. Produces a list of users (30)

My Mac Mini consumes the above "queue" and for each line on each "thread"
it will:

   1. Search Twitter
   2. For The First Three Results
      1. Follow the Person
      2. Put Follow into List
   3. DB Put Follow List
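The steps above can be sketched roughly as follows - every function here is a
stand-in (the real calls go to the Twitter API and back to App Engine), but
the threading and batching shape is the same:

```python
import threading

results = []                       # stand-in for the App Engine datastore
lock = threading.Lock()

def get_work(partition):           # stand-in: "what work?" per pseudo-thread
    return ["user%d_%d" % (partition, i) for i in range(3)]

def search_twitter(user):          # stand-in for the Twitter search call
    return ["%s_match%d" % (user, i) for i in range(5)]

def follow(name):                  # stand-in for the Twitter follow call
    return {"followed": name}

def db_put_batch(entities):        # stand-in for the batch db.put
    with lock:
        results.extend(entities)

def worker(partition):
    for user in get_work(partition):
        # first three search results only, then one batch put per user
        follows = [follow(m) for m in search_twitter(user)[:3]]
        db_put_batch(follows)

threads = [threading.Thread(target=worker, args=(p,)) for p in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```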

If a queue was in place, the second part would be split into two (or more)
stages and would possibly look like:

   1. Search Twitter
   2. For The First Three Results
      1. Enqueue follow information in "FollowQueue"


   1. Dequeue Item From FollowQueue
      1. Follow User
      2. DB PUT "Follow" Results
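As a sketch, the two stages would be something like this (again with
stand-ins; "FollowQueue" is the hypothetical queue, not a real API):

```python
from collections import deque

follow_queue = deque()             # the hypothetical "FollowQueue"
saved = []                         # stand-in for the datastore

def search_twitter(term):          # stand-in for the Twitter search call
    return ["%s_user%d" % (term, i) for i in range(5)]

def follow(name):                  # stand-in for the Twitter follow call
    return {"followed": name}

def search_stage(term):
    """Stage 1: cheap and fast - search and enqueue, no slow work."""
    for name in search_twitter(term)[:3]:
        follow_queue.append(name)

def follow_stage():
    """Stage 2: dequeue one item, do the slow follow, persist the result."""
    name = follow_queue.popleft()
    saved.append(follow(name))

search_stage("appengine")
while follow_queue:
    follow_stage()
```

The attraction is that stage 1 handlers stay well inside the deadline, and
stage 2 can be scaled independently - if I can drain the queue cheaply.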

If the above "architecture" ever comes off, I don't want to be in
the situation where I have to use my Mac Mini to call "Dequeue Item From
FollowQueue" a million times to process the data.  This is the reason why I
am reticent to build my own queue: I can't batch the work (it could time out
on App Engine) and I don't want to call a web endpoint a million times -
bearing in mind my Mac Mini is already making 10 * 30 + 10 (310) requests a
minute(ish) to orchestrate the system.

On the index-building side of things, I suspected that table was very big
and probably takes up most of the 30GB.  However, because of the way I
designed my app - and the "unfortunate" problem that it became popular quite
quickly, so I can't change it easily - I don't know how many items are in
that table, how big the table is, or how big the indexes are.  From an
application point of view there is no easy way for me to work out what to
delete, unless I iterate over everything and delete the items that no longer
work.

Let this be a lesson to everyone: if you have a model with a
ReferenceProperty and the referenced entity is deleted, you need to work out
a way of deleting the items that reference that deleted entity.

My model is as follows:

User:
   Username, etc.

Follow:
   FollowingName, etc.
   User ReferenceProperty
The model may have 6000 Follows pointing to one User model - when a user
leaves the service I delete the User entity; however, this leaves 6000
Follow items which I now have no easy way of finding: the referenced User no
longer exists, so I can't query for it.  I have to resort to iterating over
50 records at a time and deleting those elements that error when I try to
access the User ReferenceProperty.  Just as another note, I don't think I
have ever had this run to anything above 1% completion.
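The cleanup pass looks roughly like this - a sketch with in-memory stand-ins
for the datastore, where resolve() plays the role of dereferencing the
ReferenceProperty (it raises when the referenced User is gone):

```python
users = {"alice"}                                    # surviving User entities
follows = [{"user": u} for u in ["alice"] * 3 + ["bob"] * 6]  # bob has left

def resolve(follow):
    """Stand-in for dereferencing the ReferenceProperty: raises when the
    referenced User entity no longer exists."""
    if follow["user"] not in users:
        raise KeyError("referenced User no longer exists")
    return follow["user"]

def cleanup(batch_size=50):
    kept = []
    for i in range(0, len(follows), batch_size):     # 50 records at a time
        for f in follows[i:i + batch_size]:
            try:
                resolve(f)
                kept.append(f)                       # reference intact: keep
            except KeyError:
                pass                                 # orphaned Follow: delete
    return kept

remaining = cleanup()
```

With millions of Follow rows and 50 per request, you can see why this never
gets anywhere near completion.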

Thanks,
Paul

2009/5/28 Paul Kinlan <[email protected]>

> Hi Jon
>
> I appreciate the look and the feedback, other than the index the timeouts
> really only started occuring recently so I am presuming they were caused by
> long URL fetches (which also only seemed to have occurred recently - the
> twitter api hasn't been slow from another service I run)
>
> For sure, the datastore timeouts may be at the end, but now I am only
> really seeing errors because I put a lot of retry logic inplace.  The
> inserts are the end of the script because I was trying to be performant by
> using a batch put.
>
> I think
>
>
>
>
> On 28 May 2009, at 00:16, Jon McAlister <[email protected]> wrote:
>
>
>> Hi Paul,
>>
>> The reason the index building has taken this long for you is just
>> because of the amount of data your app has in the datastore. However,
>> our index building workflow is not very loaded right now so I reset
>> your "Workflow Backend Index Task Count" quota level and it should go
>> a bit faster now. Afaict, it's been building now for 12 hours, but it
>> is now about 80% done. A big feature I'd like to see done is making
>> this data available in the admin console so you can track the progress
>> more than just the "Building|Serving|Error" we show now. Hopefully we
>> should have that out soon.
>>
>> Also, about the datastore timeouts, I took a brief glance at your
>> error logs. It appears as if you are only starting the datastore
>> request near the end of the 30-second request deadline. It looks like
>> you do some urlfetch calls and those sometimes take up to 30 seconds,
>> and then you kick off one datastore call and it deadlines. Have a look
>> at the timestamps in the logs and you'll see what I mean. I'm sure
>> this doesn't account for all of your datastore timeouts, but I did see
>> a fair amount of that going on.
>>
>> Jon
>>
>> On Wed, May 27, 2009 at 11:33 AM, Paul Kinlan <[email protected]>
>> wrote:
>>
>>> Hi Guys.
>>>
>>> AppId: Twitterautofollow.
>>>
>>> I have three main issues at the moment.
>>>
>>> I am seeing an insane number of Datastore timeouts at the moment reads
>>> and
>>> writes - this has been occurring for a couple of days now?
>>> I understand there have been issues with UrlFetch yesterday. But I am
>>> still
>>> seeing a massive number of issues with requests timing out.
>>> I created one new index today and now I am seeing "Your application is
>>> exceeding a quota: Workflow Backend Index Task Count"  It is a relativly
>>> simple index that includes a __key__ property.  The object in question
>>> has
>>> no List Properties.
>>>
>>> I am getting increasingly frustrated at the moment, especially since I am
>>> paying a relativly significant amount each month now.
>>>
>>> Paul.
>>>

You received this message because you are subscribed to the Google Groups 
"Google App Engine" group.
