I haven't looked too closely at your problem, but one thing that has come up before on this list is that it's a bad idea to do things like this:

    repeat { delete N items; }

Basically, deleting items just flags them as deleted in the underlying store; they are vacuumed up later. So if you delete a lot of stuff this way, it ends up being a slow O(N^2) operation, because each pass effectively does an offset over all the previously deleted (but not yet vacuumed) items before it reaches anything live. To efficiently delete large numbers of entities, either delete them all in a single request or page through them with a cursor.
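Something along these lines should stay roughly linear. It's only a sketch against the low-level datastore API, and the "LogRecord" kind and "created" property are placeholder names I made up, not anything from your app:

    import java.util.ArrayList;
    import java.util.Date;
    import java.util.List;

    import com.google.appengine.api.datastore.Cursor;
    import com.google.appengine.api.datastore.DatastoreService;
    import com.google.appengine.api.datastore.DatastoreServiceFactory;
    import com.google.appengine.api.datastore.Entity;
    import com.google.appengine.api.datastore.FetchOptions;
    import com.google.appengine.api.datastore.Key;
    import com.google.appengine.api.datastore.PreparedQuery;
    import com.google.appengine.api.datastore.Query;
    import com.google.appengine.api.datastore.QueryResultList;

    public class BulkDelete {

        // Delete every entity of `kind` whose "created" property (a
        // placeholder name) is older than `cutoff`, 500 keys at a time.
        public static void deleteOlderThan(String kind, Date cutoff) {
            DatastoreService ds = DatastoreServiceFactory.getDatastoreService();

            // Keys-only query: we only need keys to delete, not payloads.
            Query q = new Query(kind).setKeysOnly();
            q.addFilter("created", Query.FilterOperator.LESS_THAN, cutoff);
            PreparedQuery pq = ds.prepare(q);

            FetchOptions opts = FetchOptions.Builder.withLimit(500);
            Cursor cursor = null;

            while (true) {
                if (cursor != null) {
                    // Resume where the last batch ended instead of re-scanning
                    // (and re-skipping) the tombstoned entities on every pass.
                    opts.startCursor(cursor);
                }
                QueryResultList<Entity> batch = pq.asQueryResultList(opts);
                if (batch.isEmpty()) {
                    break;
                }
                List<Key> keys = new ArrayList<Key>(batch.size());
                for (Entity e : batch) {
                    keys.add(e.getKey());
                }
                ds.delete(keys); // one batched RPC for up to 500 keys
                cursor = batch.getCursor();
            }
        }
    }

As I understand it, the cursor encodes a position in the index, so each batch picks up right after the previous one instead of re-walking the tombstones.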
Of course, this information may be out of date.

Jeff

On Thu, Dec 8, 2011 at 8:22 PM, Michael <michael.ol...@gmail.com> wrote:
> I have an hourly cron job that deletes data from the datastore older
> than one month (this data is archived elsewhere for long-term storage).
> The first run of that cron job on Tuesday after the datastore came back
> up behaved quite unusually, ended with "java.lang.OutOfMemoryError:
> Java heap space", and hasn't completed once since then. While it is
> possible that this is pure coincidence, I'm wondering if something done
> during the maintenance resulted in this behavior. I have been unable to
> get this cron job to run correctly since then.
>
> The job is quite simple and has been running happily for about a year;
> I'll describe the idea here for brevity and attach the source for those
> interested. In essence the job does the following:
> - retrieve at most 500 entities older than 1 month, by their keys only
> - send all resulting keys as a list to datastore.delete()
> - repeat until no results are returned
>
> The first run after maintenance produced the attached log-excerpt.txt.
> The brief version is the following:
> - deleted 500 objects
> - deleted 465 objects
> - deleted 213 objects (repeated 395 times)
> - out of memory
>
> It seems that, after actually deleting the first 752 objects of the
> query, the datastore got stuck on the next 213. The same 213 objects
> were sent repeatedly to datastore.delete(). No exceptions were
> generated, but the data was obviously not deleted.
>
> The next attempt (the job was retried since it crashed) produced almost
> identical output. This time, it actually deleted 174 objects, then
> tried to delete the same 213 objects over and over until it, too,
> crashed with an OutOfMemoryError. The run after that actually deleted 8
> objects before it crashed in the same manner. This continued until the
> error ran my application out of quota for the day, at which point I got
> a notification email and went to pause the queue that these jobs run
> under.
>
> Note that I am not on the High Replication datastore. I do not know why
> this is happening, but it is currently an insurmountable obstacle. I
> tried unpausing the queue temporarily and running the problematic job,
> and this time I did not even get the previously frustrating but
> informative output; instead, I merely got the "A serious problem was
> encountered . . ." message on both runs.
>
> Any help in getting this fixed or understanding the problem would be
> greatly appreciated.
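To tie the two together: the hourly job described above presumably boils down to something like the following (a reconstruction using the same placeholder names as my sketch, not the actual attached source). Because there is no cursor, every iteration re-runs the query from the top of the index and has to step over all the tombstones left by the previous batches before it finds live entities, which is where the O(N^2) cost comes from:

    import java.util.ArrayList;
    import java.util.Date;
    import java.util.List;

    import com.google.appengine.api.datastore.DatastoreService;
    import com.google.appengine.api.datastore.DatastoreServiceFactory;
    import com.google.appengine.api.datastore.Entity;
    import com.google.appengine.api.datastore.FetchOptions;
    import com.google.appengine.api.datastore.Key;
    import com.google.appengine.api.datastore.Query;

    public class HourlyCleanup {

        // The loop as described: fetch up to 500 keys, delete, repeat.
        public static void deleteExpired(String kind, Date cutoff) {
            DatastoreService ds = DatastoreServiceFactory.getDatastoreService();
            Query q = new Query(kind).setKeysOnly();
            q.addFilter("created", Query.FilterOperator.LESS_THAN, cutoff);

            while (true) {
                // Restarts from the top of the index on every pass; each
                // query must skip every previously tombstoned entity first.
                List<Entity> batch = ds.prepare(q)
                        .asList(FetchOptions.Builder.withLimit(500));
                if (batch.isEmpty()) {
                    return; // no results: done
                }
                List<Key> keys = new ArrayList<Key>(batch.size());
                for (Entity e : batch) {
                    keys.add(e.getKey());
                }
                ds.delete(keys);
            }
        }
    }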
--
We are the 20%