Hey Eli,
  Thanks for the additional info.  I'll have to watch for similar
issues, as we are getting ready to test loading and deleting lots of
data again.

  Just a note: we use self-sizing batches to help speed up the
deletes.  We start with a batch size of 100, then keep stepping it up
by 15% until we hit timeouts.  Once we hit timeouts, we back off in 5%
increments until the timeouts stop.  This helped us out a bunch when
wiping out our largest model.  We find the batch size seems to
stabilize around 375 for that model.
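
  In case it helps, here is a rough sketch of that sizing loop.  It is
not our production code, just the idea under a few assumptions:
BatchTimeout, SelfSizingBatch, delete_all, fetch_keys and delete_keys
are all made-up names, the 5% back-off is taken from the current size,
and the loop gives up after a run of consecutive timeouts so it cannot
spin forever.

```python
class BatchTimeout(Exception):
    """Hypothetical stand-in for whatever timeout error the datastore raises."""


class SelfSizingBatch(object):
    """Batch size that grows 15% per success until the first timeout,
    then only shrinks (5% per timeout) and otherwise holds steady."""

    def __init__(self, start=100, step_up_pct=15, back_off_pct=5):
        self.size = start
        self.step_up_pct = step_up_pct
        self.back_off_pct = back_off_pct
        self.growing = True  # still in the step-up phase

    def record(self, timed_out):
        """Update the batch size after one attempt."""
        if timed_out:
            self.growing = False  # first timeout ends the step-up phase
            step = max(1, self.size * self.back_off_pct // 100)
            self.size = max(1, self.size - step)
        elif self.growing:
            self.size += max(1, self.size * self.step_up_pct // 100)


def delete_all(fetch_keys, delete_keys, batch=None, max_timeouts_in_a_row=20):
    """Delete until fetch_keys returns nothing.

    fetch_keys(n) returns up to n keys; delete_keys(keys) removes them.
    Both are caller-supplied (assumed) and raise BatchTimeout on timeout.
    Returns (number_deleted, final_batch_size).
    """
    batch = batch or SelfSizingBatch()
    deleted = 0
    timeouts_in_a_row = 0
    while timeouts_in_a_row < max_timeouts_in_a_row:
        try:
            keys = fetch_keys(batch.size)
            if not keys:
                break  # nothing left to delete
            delete_keys(keys)
        except BatchTimeout:
            timeouts_in_a_row += 1
            batch.record(timed_out=True)
            continue
        timeouts_in_a_row = 0
        deleted += len(keys)
        batch.record(timed_out=False)
    return deleted, batch.size
```

  On App Engine, fetch_keys could wrap a keys-only query like your
"Select __key__ from MyBigModel" with .fetch(n), and delete_keys could
wrap db.delete(), each translating the real timeout error into
BatchTimeout.  Whether to keep the size fixed once the timeouts stop,
as this sketch does, or to try growing again later is a judgment call.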

Robert

On Sun, Feb 21, 2010 at 1:31 PM, Eli Jones <[email protected]> wrote:
> Do you mean on the Main page of the Dashboard for my app?  No, there was no
> indication of throttling there.
> Mainly, I would get timeouts from the appengine_console.py when running this
> command:
> result = db.GqlQuery("Select __key__ from MyBigModel").fetch(1)
> while fetching another Model would work fine:
> result = db.GqlQuery("Select __key__ from MyRegularModel").fetch(1)
> It looks like I could do .get_by_key_name() explicitly.. but I don't even
> know which key names are still in the Model and I can't View the Datastore
> to find out.  I tried guessing.. but no luck so far.
> Also, I left the automated task running overnight to continue doing
> db.delete() in batches of 100 on the model.. but after 8 hours.. it had
> managed only 1,600 batches.  Which is 200 per hour and about 3 per minute.
> So.. it was only able to manage to db.delete() 300 entities per minute over
> the last 8 hours.
> My guess is that these entities are just sprawled out all over the place..
> and doing a .fetch(100) on them just makes it cry.
> I just gave the datastore an hour and a half break to collect itself.. but
> now it just times out when the task tries to fetch(100).  times out on
> fetch(20)..
> It's a little silly to see that clearing out 78 MB of Entities would bog
> down the datastore this much.. but I guess I can understand since I wasn't
> being very polite to the datastore and just aggressively told it to keep
> deleting over and over.
> I think I've deleted about 410,000 entities so far.. I was able to do about
> 250,000 in batches of 500.. then I had to shift down to batches of 100 and
> that managed to wipe another 160,000.  If I remember correctly, this leaves
> me with about 140,000 left in there somewhere.  But, now I have to wait for
> the Datastore to cooperate.
> Next time, I may try doing db.delete() using pre-generated lists of
> key_names.. the fetch() seems to be what is causing all the trouble.
>
> For those who like logs, here are the DeadlineExceededErrors I would see in
> the log (this would be for when it was trying to do fetch(100) for a "Select
> __key__ from MyBigModel"):
>
>   File "/base/python_lib/versions/1/google/appengine/ext/db/__init__.py", line 1616, in fetch
>     raw = raw_query.Get(limit, offset, rpc=rpc)
>   File "/base/python_lib/versions/1/google/appengine/api/datastore.py", line 1183, in Get
>     limit=limit, offset=offset, prefetch_count=limit, **kwargs)._Get(limit)
>   File "/base/python_lib/versions/1/google/appengine/api/datastore.py", line 1110, in _Run
>     datastore_pb.QueryResult(), rpc)
>   File "/base/python_lib/versions/1/google/appengine/api/datastore.py", line 176, in _MakeSyncCall
>     rpc.wait()
>   File "/base/python_lib/versions/1/google/appengine/api/apiproxy_stub_map.py", line 460, in wait
>     self.__rpc.Wait()
>   File "/base/python_lib/versions/1/google/appengine/api/apiproxy_rpc.py", line 112, in Wait
>     rpc_completed = self._WaitImpl()
>   File "/base/python_lib/versions/1/google/appengine/runtime/apiproxy.py", line 108, in _WaitImpl
>     rpc_completed = _apphosting_runtime___python__apiproxy.Wait(self)
>
> On Sun, Feb 21, 2010 at 10:12 AM, Robert Kluin <[email protected]>
> wrote:
>>
>> Hi Eli,
>>  Did you happen to look at the App Engine Console after the datastore
>> was "napping"?  A few months ago when clearing a test datastore we hit
>> a similar thing.  When we looked at the App Engine Console it said the
>> app was being temporarily throttled.
>>
>>  Was just curious if that is what you encountered too?
>>
>> Robert
>>
>> On Sat, Feb 20, 2010 at 11:49 PM, Eli Jones <[email protected]> wrote:
>> > As a side note.. once I hit about 250,000 entities deleted it seems the
>> > datastore took a nap on me..
>> > So, now I'm waiting for it to finish whatever it is doing underneath
>> > before I can continue deleting.
>> > Took a nap = db.delete() or .fetch(1) from the model times out.
>> > Though, I can .fetch() from my other Models just fine.  I figure it is
>> > just shuffling the data around and merging it to new tablets.
>> > Granted.. the Model has been unavailable (I'm just judging this by
>> > seeing if I can do a .fetch(1) using the appengine_console from my
>> > local) for 40 minutes.. which is much longer than the brief period of
>> > time mentioned for tablet unavailability here:
>> > http://code.google.com/appengine/articles/handling_datastore_errors.html
>> > Ah.. seems I still can't .fetch from appengine_console, but I can sort
>> > of limp along and delete 100 at a time (but it takes about 16 seconds
>> > to delete 100 entities now) using the task I have set up to do this.
>> >
>> > On Sat, Feb 20, 2010 at 9:54 PM, Eli Jones <[email protected]> wrote:
>> >>
>> >> I am currently going through the process of deleting 500,000 entities
>> >> from my datastore.
>> >> Here are the different stats I have so far
>> >> db.delete() for:
>> >> 100 entities = 2,179 API_CPU
>> >> 200 entities = 4,345 API_CPU
>> >> 500 entities = 10,845 API_CPU
>> >> So.. it doesn't seem like you get better per entity API_CPU for
>> >> deleting more at once.  It seems to average about 21 API_CPU per
>> >> entity deleted.
>> >> There doesn't even really seem to be a general time benefit either.  It
>> >> seems to average about 1 to 2 seconds per 100 entities deleted.
>> >> On Sat, Feb 20, 2010 at 3:21 AM, kang <[email protected]> wrote:
>> >>>
>> >>> I'm going to clear the datastore. I use the following code:
>> >>>     old_date = datetime.datetime(2009,10,1)
>> >>>     old_updates = SomeUpdate.all().filter("updated <", old_date).fetch(20)
>> >>>     db.delete(old_updates)
>> >>> it costs me nearly 1982cpu_ms 1945api_cpu_ms every time. Is it normal?
>> >>>
>> >>> --
>> >>> Stay hungry,Stay foolish.
>> >>>
>> >>> --
>> >>> You received this message because you are subscribed to the Google
>> >>> Groups
>> >>> "Google App Engine" group.
>> >>> To post to this group, send email to
>> >>> [email protected].
>> >>> To unsubscribe from this group, send email to
>> >>> [email protected].
>> >>> For more options, visit this group at
>> >>> http://groups.google.com/group/google-appengine?hl=en.
>> >>
>> >
>> >
>>
>>
>
>
