I am having similar problems using the bulkupdate library, which was something of a precursor to MapReduce. bulkupdate iterates over a query instead of fetching, and I've found that to be buggy and unreliable:
http://code.google.com/p/googleappengine/issues/detail?id=4046

Could it be that MapReduce uses the iterator interface to the query (for i in q) instead of fetching batches of entities? That would explain why your custom delete job, which uses fetch, takes less time to complete than the MR job.

Pascal

On Nov 15, 1:18 am, Eli Jones <[email protected]> wrote:
> From what I could tell, the map reduce delete job took up several times more
> CPU time (and wall clock time) than my custom delete job usually took.
>
> My usual utility class uses this method for deletes:
>
> 1. Create a query for all entities in a model with keys_only = True.
> 2. Fetch 100 keys.
> 3. Issue a deferred task to delete those 100 key names.
> 4. Use a cursor to fetch 100 more, and issue deferred deletes until the
> query returns no more entities.
>
> This is usually pretty fast, since the only bottleneck is the time it
> takes to fetch 100 key names and add the deferred task. The surprising fact
> was that the default map reduce delete from the Datastore Admin page took
> so much CPU.
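The four-step keys-only delete pattern Eli describes can be sketched roughly as below. This is a plain-Python simulation, not real App Engine code: FakeDatastore, fetch_keys, and the task list are illustrative stand-ins for a keys_only db.Query, query cursors, and deferred.defer respectively, so the control flow is clear without the SDK.

```python
# Sketch of the keys-only batched-delete pattern described above.
# FakeDatastore, fetch_keys, and the deferred_tasks list are hypothetical
# stand-ins for the datastore, a keys_only query with cursors, and the
# deferred task queue.

BATCH_SIZE = 100

class FakeDatastore:
    """Stand-in for the datastore: holds only entity keys."""
    def __init__(self, n):
        self.keys = ["key%d" % i for i in range(n)]

    def fetch_keys(self, limit, cursor=0):
        """Keys-only fetch: return up to `limit` keys plus the next cursor."""
        batch = self.keys[cursor:cursor + limit]
        return batch, cursor + len(batch)

    def delete(self, keys):
        keyset = set(keys)
        self.keys = [k for k in self.keys if k not in keyset]

def delete_all(store):
    """Steps 2-4: fetch a batch of keys, enqueue a deferred delete,
    advance the cursor, repeat until the query returns nothing."""
    deferred_tasks = []
    cursor = 0
    while True:
        batch, cursor = store.fetch_keys(BATCH_SIZE, cursor)
        if not batch:
            break
        # In real code this would be deferred.defer(db.delete, batch);
        # here we just queue the work to run later.
        deferred_tasks.append(lambda b=batch: store.delete(b))
    # Simulate the task queue draining.
    for task in deferred_tasks:
        task()

store = FakeDatastore(250)
delete_all(store)
print(len(store.keys))  # 0 once all deferred deletes have run
```

The point of the pattern is that the request only pays for fetching 100 key names and enqueuing a task; the actual deletes happen in the background, which is why it tends to beat the MapReduce delete on both CPU and wall-clock time.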
