I haven't looked too closely at your problem, but one thing that has come
up before on this list is that it's a bad idea to do things like this:

repeat {
   delete N items;
}

Basically, deleting an entity just flags it as deleted in the underlying
store; the tombstoned rows are only vacuumed up later.  So if you delete a
lot of data in a loop like this, each successive query has to skip over all
the previously deleted (but not yet vacuumed) rows, and the whole job
degrades into a slow O(N^2) operation.

To efficiently delete large numbers of entities, either delete them all in
a single request or use a cursor.
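
If it helps, a rough sketch of the cursor approach with the low-level
datastore API might look something like the code below.  This is untested,
and the "LogEntry" kind, the "created" property, and the OldEntityPurge /
purgeOldEntities names are just placeholders for whatever your job
actually uses:

   import java.util.ArrayList;
   import java.util.Date;
   import java.util.List;
   import com.google.appengine.api.datastore.*;

   public class OldEntityPurge {
       // Sketch only: delete everything older than 'cutoff' in keys-only,
       // cursor-advanced batches of 500.
       static void purgeOldEntities(DatastoreService ds, Date cutoff) {
           Query q = new Query("LogEntry")          // placeholder kind
               .setKeysOnly()
               .addFilter("created", Query.FilterOperator.LESS_THAN, cutoff);
           PreparedQuery pq = ds.prepare(q);

           Cursor cursor = null;
           while (true) {
               FetchOptions opts = FetchOptions.Builder.withLimit(500);
               if (cursor != null) {
                   opts.startCursor(cursor);        // resume after the last batch
               }
               QueryResultList<Entity> batch = pq.asQueryResultList(opts);
               if (batch.isEmpty()) {
                   break;
               }
               List<Key> keys = new ArrayList<Key>(batch.size());
               for (Entity e : batch) {
                   keys.add(e.getKey());
               }
               ds.delete(keys);                     // one batched delete per page
               cursor = batch.getCursor();          // advance instead of re-querying from scratch
           }
       }
   }

The single-request variant is even simpler: run one keys-only query with no
limit, collect all the keys, and make a single ds.delete(keys) call.  For
very large result sets, the cursor version is safer since each batch keeps
memory bounded.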

Of course, this information may be out of date.

Jeff

On Thu, Dec 8, 2011 at 8:22 PM, Michael <michael.ol...@gmail.com> wrote:

> I have an hourly cron job that deletes data from the datastore older than
> one month (this data is archived elsewhere for long term storage).  The
> first run of that cron job on Tuesday after the datastore came back up
> behaved quite unusually, ended with "java.lang.OutOfMemoryError: Java heap
> space", and hasn't completed once since then.  While it is possible that
> this is pure coincidence, I'm wondering if something done during the
> maintenance resulted in this behavior.  I have been unable to get this cron
> job to run correctly since then.
>
> The job is quite simple, and has been running happily for about a year; I
> will present the idea here for brevity and attach the source for those
> interested.  In essence the job does the following:
> - retrieve at most 500 entities older than 1 month by their keys only
> - send all resulting keys as a list to datastore.delete()
> - repeat until no results are returned
>
> The first run after maintenance produced the attached log-excerpt.txt.
>  The brief version is the following:
> - deleted 500 objects
> - deleted 465 objects
> - deleted 213 objects (repeated 395 times)
> - out of memory
>
> It seems that, after actually deleting the first 752 objects of the query,
> the datastore got stuck on the next 213.  The same 213 objects were sent
> repeatedly to datastore.delete().  No exceptions were generated, but the
> data was obviously not deleted.
>
> The next attempt (the job was retried since it crashed) produced almost
> identical output.  This time, it actually deleted 174 objects, then tried
> to delete the same 213 objects over and over until it, too, crashed with an
> OutOfMemoryError.  The run after that actually deleted 8 objects before it
> crashed in the same manner.  This continued until the repeated failures ran
> my application out of quota for the day, at which point I got a notification
> email and paused the queue that these jobs run under.
>
> Note, I am not on the high replication datastore.  I do not know why this
> is happening, but it is currently an insurmountable obstacle.  I tried
> unpausing the queue temporarily and running the problematic job, and this
> time I did not even get the previously frustrating but informative output;
> instead, I merely got the "A serious problem was encountered . . ." message
> on both runs.
>
> Any help in getting this fixed or understanding the problem would be
> greatly appreciated.
>
>



-- 
We are the 20%
