Hey Guys,

Thanks for the replies. I am querying descendants because about 6-7
different models have this one model as their parent. I do not know of
any way to get just the keys from that sort of query, although I did
have that before, when I used the ancestor() filter.
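
On the idea of collecting keys from several kinds and deleting them together, here is a rough sketch of just the chunking part. It uses plain strings as stand-ins for datastore keys, and chunked() and delete_in_chunks() are hypothetical helper names, not SDK functions. In the real handler delete_fn would presumably be db.delete, and the keys might come from a kindless keys-only ancestor query (something like db.Query(keys_only=True).ancestor(x)), which I have not verified against this SDK version.

```python
def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def delete_in_chunks(keys, delete_fn, size=100):
    """Call delete_fn once per chunk and return the number of calls made."""
    calls = 0
    for chunk in chunked(list(keys), size):
        delete_fn(chunk)
        calls += 1
    return calls


# Stand-ins for a tracker key plus descendant keys of any kind (assumed data).
tracker_key = "Tracking:1"
descendant_keys = ["Keyword:%d" % i for i in range(250)]

# Descendants first, parent last, so the tracker goes in the final chunk.
all_keys = descendant_keys + [tracker_key]

deleted = []  # records what a real db.delete() would have removed
calls = delete_in_chunks(all_keys, deleted.extend, size=100)
```

Putting the tracker key last means a run that is interrupted partway leaves the parent in place, so a later run can still find it by its delete_track flag.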

I'm not sure how many items I can fetch before having to worry about a
datastore timeout error, so that's why I picked 100. (I'd rather have
22 cycles of 100 deleted at a time than try to do 2000 at a
time and have the second one fail.)
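
To make the small-batch reasoning concrete, here is a minimal sketch of a loop that deletes fixed-size batches and stops on its own once a time budget is used up, rather than waiting for the deadline exception. Everything in it is an assumption for illustration: delete_until_deadline() and FakeStore are hypothetical names, the 20 second budget is arbitrary, and the in-memory store only stands in for the datastore. In the handler, fetch_batch could wrap db.query_descendants(x).fetch and delete_batch could be db.delete.

```python
import time


def delete_until_deadline(fetch_batch, delete_batch, batch_size=100,
                          budget_seconds=20.0, clock=time.time):
    """Delete batches until nothing is left or the time budget runs out.

    Returns (deleted_count, finished), where finished is True only if
    fetch_batch came back empty.
    """
    started = clock()
    deleted_count = 0
    while clock() - started < budget_seconds:
        batch = fetch_batch(batch_size)
        if not batch:
            return deleted_count, True
        delete_batch(batch)
        deleted_count += len(batch)
    return deleted_count, False


class FakeStore(object):
    """In-memory stand-in for the datastore (illustration only)."""

    def __init__(self, n):
        self.keys = list(range(n))

    def fetch(self, limit):
        return self.keys[:limit]

    def delete(self, batch):
        doomed = set(batch)
        self.keys = [k for k in self.keys if k not in doomed]


store = FakeStore(2200)  # 22 batches of 100
count, finished = delete_until_deadline(store.fetch, store.delete)
```

When finished comes back False, the handler can enqueue the follow-up task the same way the except branch does now, and a failed batch costs at most 100 entities of work.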

Thanks again!

-Prateek

On Sep 12, 7:14 am, Alexander Trakhimenok
<[email protected]> wrote:
> Hi someone1,
>
> Your code looks good and pretty optimized.
>
> Delete takes datastore CPU time, as it needs to update all the indexes
> you have on your models.
>
> I see just a few areas for improvement.
>
> 1. Try to fetch just keys from the "Tracking" kind, as you do not need
> the model instance to delete it or to query its descendants (you will
> need to use a different query for descendants).
> 2. You are probably fetching more trackers than needed, as you can time
> out with DeadlineExceededError. I'm not sure how the iterator over a
> filter works, as I usually use fetch().
> 3. In your inner loop it may make sense to accumulate as many child
> keys as possible and then call a single db.delete(). As I understand it,
> you can collect a list of keys from different kinds, so you can combine
> the "Tracking" and the descendants for deletion.
>
> But I do not think this will significantly decrease your API CPU time.
> Have no idea how much exactly but my guess would be around 5%.
>
> Alex, http://sharp-developer.net/
>
> On Sep 11, 8:00 pm, someone1 <[email protected]> wrote:
>
> > I have tried asking/researching this before, but I really need a more
> > efficient way to delete large amounts of data from the datastore. In
> > short, I am only able to remove 0.1 GB per 6.5 hours of CPU time, and
> > all of that is datastore time.
>
> > Here is the code:
>
> > class DeleteKeywords(webapp.RequestHandler):
> >     def get(self):
> >         try:
> >             trackers = Tracking.all().filter('delete_track',True)
> >             for x in trackers:
> >                 keys = db.query_descendants(x).fetch(100)
> >                 while keys:
> >                     db.delete(keys)
> >                     keys = db.query_descendants(x).fetch(100)
> >                 x.delete()
> >         except DeadlineExceededError:
> >             queue = taskqueue.Queue(name='delete-tasks')
> >             queue.add(taskqueue.Task(url='/tasks/delete_tracks',
> >                                      method='GET'))
> >             self.response.out.write(
> >                 "Ran out of time, need to delete more!")
>
> > It's really small and simple, and I did not think it'd use up so much
> > CPU time. Why is it that the API CPU time is so much smaller than the
> > Datastore CPU time? Is there any way to consume more of the datastore
> > time than the API CPU time?
>
> > I'd really like to clear out that database without needing to wait
> > almost 50 hours worth of CPU time (which is odd since it runs for
> > maybe 30 minutes - 1 hour, only 1 task a minute, and it uses up all
> > that time... is it calculating wrong?)
>
> > Anybody have any suggestions?