Hey guys, thanks for the replies. I am querying descendants because I have about 6-7 different models that inherit from this one parent model. I don't know of any way to get just the keys from that sort of query, although I did have a keys-only fetch before, back when I used the ancestor() filter.
I'm not sure how many items I can fetch before having to worry about a datastore timeout error, so that's why I picked 100 (I'd rather have 22 cycles of 100 deleted at a time than try to do 2000 at a time and have the second one fail...).

Thanks again!
-Prateek

On Sep 12, 7:14 am, Alexander Trakhimenok <[email protected]> wrote:
> Hi someone1,
>
> Your code looks good and pretty optimized.
>
> Delete takes datastore CPU time because it needs to update all the
> indexes you have on your models.
>
> I see just a few areas for improvement:
>
> 1. Try to fetch just keys from the "Tracking" kind, as you do not need
> the model instance to delete it or to query descendants (you will need
> to use a different query for the descendants).
> 2. You are probably fetching more trackers than needed, since you can
> time out with a DeadlineExceededError - I'm not sure how the iterator
> over filter() works, as I usually use fetch().
> 3. In your inner cycle it may make sense to accumulate as many child
> keys as possible and then call a single db.delete(). As I understand
> it, you can collect a list of keys from different kinds, so you can
> combine the "Tracking" key and the descendants for deletion.
>
> But I do not think this will significantly decrease your API CPU time.
> I have no idea how much exactly, but my guess would be around 5%.
>
> Alex, http://sharp-developer.net/
>
> On Sep 11, 8:00 pm, someone1 <[email protected]> wrote:
> > I have tried asking/researching this before, but I really need a more
> > efficient way to delete mass amounts of data from the datastore. In
> > short, I am only able to remove .1GB per 6.5 hrs of CPU time, and all
> > of that is datastore time.
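Alexander's suggestion 3 (accumulate keys, then issue one combined delete) can be illustrated without the App Engine SDK. The sketch below is plain Python: it only plans the batches a delete loop would hand to db.delete(). The batch size of 500 reflects what I believe was the datastore's per-call entity limit at the time; treat that number as an assumption to verify against the docs.

```python
# Suggestion 3 sketched: accumulate keys and split them into as few
# delete batches as possible, instead of one db.delete() per 100-key
# fetch. Pure Python; batch_size=500 is an assumed datastore limit.

def chunked(items, batch_size=500):
    """Yield successive batches of at most batch_size items."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]

def plan_deletes(keys, batch_size=500):
    """Return the batches a delete loop would pass to db.delete()."""
    return list(chunked(keys, batch_size))

if __name__ == "__main__":
    fake_keys = ["key-%d" % i for i in range(1201)]
    batches = plan_deletes(fake_keys)
    print(len(batches))      # 3 delete RPCs instead of 13 fetch/delete cycles of 100
    print(len(batches[-1]))  # 201 keys in the final partial batch
```

Fewer, larger delete calls means fewer RPC round-trips; whether it reduces the datastore CPU charge much is a separate question, as Alexander notes.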
> > Here is the code:
> >
> > class DeleteKeywords(webapp.RequestHandler):
> >     def get(self):
> >         try:
> >             trackers = Tracking.all().filter('delete_track', True)
> >             for x in trackers:
> >                 keys = db.query_descendants(x).fetch(100)
> >                 while keys:
> >                     db.delete(keys)
> >                     keys = db.query_descendants(x).fetch(100)
> >                 x.delete()
> >         except DeadlineExceededError:
> >             queue = taskqueue.Queue(name='delete-tasks')
> >             queue.add(taskqueue.Task(url='/tasks/delete_tracks',
> >                                      method='GET'))
> >             self.response.out.write("Ran out of time, need to delete more!")
> >
> > It's really small and simple, and I did not think it would use up so
> > much CPU time. Why is the API CPU time so much smaller than the
> > datastore CPU time? Is there any way to consume more of the datastore
> > time than the API CPU time?
> >
> > I'd really like to clear out that database without needing to wait
> > almost 50 hours' worth of CPU time (which is odd, since it runs for
> > maybe 30 minutes to 1 hour, only 1 task a minute, and it uses up all
> > that time... is it calculating wrong?)
> >
> > Anybody have any suggestions?
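For what it's worth, here is one way the inner loop could look with the advice applied: combine each tracker's own key with its descendants' keys and delete them in as few calls as possible. Since the App Engine SDK can't run outside the sandbox, `db` is stubbed with a minimal fake below; the method name mirrors the SDK's db.delete() (which accepts a list of keys), but this is a sketch of the control flow, not the SDK itself.

```python
# Hypothetical rework of the delete loop per suggestion 3: the
# tracker's key and all descendant keys go out together, split into
# batches of at most 500 (an assumed per-call limit). FakeDb stands
# in for google.appengine.ext.db so this can run anywhere.

class FakeDb(object):
    def __init__(self):
        self.delete_calls = []

    def delete(self, keys):
        # Real db.delete() accepts a key or a list of keys/entities.
        self.delete_calls.append(list(keys))

db = FakeDb()

def delete_tracker(tracker_key, descendant_keys, batch_size=500):
    """Delete a tracker and its descendants with as few RPCs as possible."""
    pending = list(descendant_keys) + [tracker_key]
    for i in range(0, len(pending), batch_size):
        db.delete(pending[i:i + batch_size])

if __name__ == "__main__":
    delete_tracker("tracker-1", ["child-%d" % i for i in range(999)])
    print(len(db.delete_calls))  # 2 calls cover all 1000 keys
```

On real App Engine you would also fetch the trackers with a keys-only query (suggestion 1), since the full model instance is never needed just to delete.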
