That's some great insight. I wonder if this is why someone occasionally posts saying they have a large amount of datastore usage and insists that their database statistics show only 6MB or something like that. I would assume that the statistics show the smaller amount, whereas the datastore shows the larger amount until the deleted entities are actually garbage collected. Also, the query insight is most useful. It goes to show that one really needs a good understanding of how the App Engine architecture works.
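For anyone following along, here is a rough sketch of the pattern discussed in the quoted thread: a keys-only query for n items, a batch delete by key, and a follow-up task that resumes from the cursor. It is plain Python against an in-memory stand-in, not the App Engine API. `FakeDatastore`, `fetch_keys`, `delete_task`, `purge`, and the figure of 4 index entries per entity are all made up for illustration, and counting one write per entity plus one per index entry is only a simplified reading of the point that every delete also has to update the indexes.

```python
from bisect import bisect_right
from collections import deque


class FakeDatastore:
    """In-memory stand-in for a datastore (hypothetical, not the App Engine API)."""

    def __init__(self, keys, indexes_per_entity):
        self.keys = sorted(keys)
        self.indexes_per_entity = indexes_per_entity  # assumed, for illustration
        self.writes = 0  # entity rows plus index rows touched by deletes

    def fetch_keys(self, cursor, limit):
        """Keys-only query: return up to `limit` keys that sort after `cursor`."""
        start = 0 if cursor is None else bisect_right(self.keys, cursor)
        batch = self.keys[start:start + limit]
        next_cursor = batch[-1] if batch else cursor
        return batch, next_cursor

    def delete(self, keys):
        """Batch delete: each entity costs one write plus one per index entry."""
        doomed = set(keys)
        self.keys = [k for k in self.keys if k not in doomed]
        self.writes += len(doomed) * (1 + self.indexes_per_entity)


def delete_task(store, queue, cursor, batch_size):
    """One task: delete a batch, then enqueue a follow-up carrying the cursor."""
    batch, next_cursor = store.fetch_keys(cursor, batch_size)
    if not batch:
        return  # nothing left, so the chain stops here
    store.delete(batch)
    queue.append(next_cursor)


def purge(store, batch_size):
    """Run the task chain to completion and return the number of tasks run."""
    queue = deque([None])  # the first task starts without a cursor
    tasks_run = 0
    while queue:
        delete_task(store, queue, queue.popleft(), batch_size)
        tasks_run += 1
    return tasks_run


store = FakeDatastore(keys=range(1000), indexes_per_entity=4)
tasks = purge(store, batch_size=200)
print(tasks, len(store.keys), store.writes)
```

Under these assumptions, 1,000 entities turn into 5,000 writes, so at 1 to 2 million entities the index work dominates no matter how the deletes are batched.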
On Sun, Nov 21, 2010 at 1:12 PM, Robert Kluin <[email protected]> wrote:

> There are a number of posts where Googlers have mentioned how deletes
> are performed. Here is a recent one,
>
> http://groups.google.com/group/google-appengine-python/msg/598a671cfef98fb4
>
> So they already use some type of soft-delete + later batch cleanup
> approach.
>
> Robert
>
> On Sun, Nov 21, 2010 at 13:50, Stephen Johnson <[email protected]> wrote:
> > I think the biggest thing for everyone to remember is that each delete of
> > an entity also requires updating all indexes for that entity. So for 1 to 2
> > million entities that's a lot of work especially with transactions,
> > concurrency, etc. I'm not sure that marking the entity for deletion and
> > then having some background task later come through actually saves anything
> > because all that work still needs to be done. Also, would those entities
> > still show up in queries. If they shouldn't show up then you'd have to go
> > and mark all index entries as deleted which is just as much work as
> > deleting them in the first place. I would say however that perhaps there
> > should be the equivalent of a DROP TABLE so that you can get rid of an
> > entire entity kind. That could definitely be done in the background using
> > bulk delete methods and would immediately remove that entity kind from
> > being visible and used and as such there wouldn't be the need to worry
> > about concurrency and transaction isolation etc.
> >
> > On Sun, Nov 21, 2010 at 12:10 AM, Derrick Schneider
> > <[email protected]> wrote:
> >>
> >> Thanks for the reminder about the thread. I had read the initial couple
> >> of posts but hadn't noticed the follow-ups.
> >> My procedure is pretty straightforward, and is similar to the "fast" one
> >> described: Kick off a task that does a keys-only query for n items and
> >> then deletes them (using the form where you just pass a list of keys),
> >> grabs the cursor, and kicks off another task that pulls in the cursor and
> >> gets n more, and so on. (I forget the value of n.)
> >>
> >> I don't know exactly how many items were being deleted, but I've got ~7.5
> >> million entries in total, spread from July to today. Probably our largest
> >> number of entries actually comes from August, but I was just deleting
> >> July. So I guess "few hundred thousand" isn't accurate, and it was
> >> probably closer to 1-2 million. Hm.
> >>
> >> But reading that Bulk Delete thread, I agree it would be nice to mark an
> >> item as deletable and have some cheaper, belly-of-Google task sweep
> >> through them.
> >>
> >> Derrick
> >>
> >> On Sat, Nov 20, 2010 at 8:55 PM, Stephen Johnson <[email protected]>
> >> wrote:
> >>>
> >>> Read post Bulk Deletion Woe that was posted this past week. It discusses
> >>> this. You should add what procedure you did to delete your entities, how
> >>> many etc.
> >>>
> >>> On Sat, Nov 20, 2010 at 9:21 PM, Derrick Schneider
> >>> <[email protected]> wrote:
> >>>>
> >>>> One thing I've noticed as I'm purging older items in the datastore is
> >>>> that deletes are really CPU intensive. Granted, I was probably deleting
> >>>> a few hundred thousand entries, but I went from well under my free CPU
> >>>> threshold for the day to burning out the rest of our budgeted 14.5 hours
> >>>> over the course of about 40 minutes.
> >>>> Has anyone else noticed this?
> >>>> Derrick
> >>>> --
> >>>> Writer. Programmer. Puzzle Designer.
> >>>> http://www.obsessionwithfood.com
> >>>>
> >>>> --
> >>>> You received this message because you are subscribed to the Google
> >>>> Groups "Google App Engine" group.
> >>>> To post to this group, send email to [email protected].
> >>>> To unsubscribe from this group, send email to
> >>>> [email protected].
> >>>> For more options, visit this group at
> >>>> http://groups.google.com/group/google-appengine?hl=en.
