I'm pretty sure mapreduce already supports keys-only iteration with
the DatastoreKeyInputReader.

http://code.google.com/p/googleappengine/source/browse/trunk/python/google/appengine/ext/mapreduce/input_readers.py#377
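
From my reading of that reader, the mapper handler gets called with a key
instead of a loaded entity. A rough sketch of what such a handler could look
like for the sampling case discussed below. This is plain Python with no App
Engine imports; `sample_key` and `rate` are names I made up for illustration,
and strings stand in for datastore keys:

```python
import random


def sample_key(key, rate=0.01, rng=random):
    """Hypothetical mapper-style handler: receives only a key, never an entity.

    Yields the key with probability `rate` so a later step can collect a sample.
    """
    if rng.random() < rate:
        yield key


if __name__ == "__main__":
    rng = random.Random(42)
    # Strings stand in for datastore keys in this sketch.
    keys = ["Question:%d" % i for i in range(1000)]
    picked = [k for key in keys for k in sample_key(key, rate=0.05, rng=rng)]
    print(len(picked))
```

How the real handler gets wired to the reader is something to check against
the mapreduce source linked above; I have not tested that part.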

Granted, I don't use mapreduce since I'm so used to writing my own code to
fire off tasks to do something with batches of entities, but it seems that
the documentation for mapreduce is so light that many people don't really
understand how the API works.

As someone mentioned in the "Bulk Deletion Woes" thread, the Datastore
Admin deleter uses the DatastoreKeyInputReader:

http://code.google.com/p/googleappengine/source/browse/trunk/python/google/appengine/ext/datastore_admin/delete_handler.py#148

So maybe some insights about mapreducing with keys_only could be gleaned
there.
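
In case it helps with the original question quoted below, here is roughly how
I picture Stephen's pre-generated lists once you have the keys in hand. It's
plain Python with an in-memory list standing in for the new entity kind;
`build_quiz_lists` and `take_quiz` are made-up names, not anything from the
mapreduce library:

```python
import random


def build_quiz_lists(question_keys, quiz_size, num_quizzes, rng=random):
    """Pre-generate `num_quizzes` lists, each a random sample of `quiz_size` keys.

    `question_keys` must be a sequence (e.g. a list) with at least
    `quiz_size` items, otherwise `sample` raises ValueError.
    """
    return [rng.sample(question_keys, quiz_size) for _ in range(num_quizzes)]


def take_quiz(pending):
    """Pop the oldest pre-generated list, or return None when none are left."""
    return pending.pop(0) if pending else None


if __name__ == "__main__":
    # Integers stand in for question keys in this sketch.
    keys = list(range(500))
    pending = build_quiz_lists(keys, quiz_size=10, num_quizzes=5)
    quiz = take_quiz(pending)
    print(quiz, len(pending))
```

The expensive pass over all the keys happens only when the lists are rebuilt,
and serving a quiz is a single fetch of one stored list.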


On Mon, Nov 29, 2010 at 3:20 PM, Ikai Lan (Google) <[email protected]> wrote:

> Thanks for the feedback Stephen. Yeah, I can see where having an option to
> not return the Entity could be useful.
>
> Yep, Stephen's way is the way to go (we talked about this at lunch the
> other day). Create a new set of Entities with random keys that point to the
> original entities, then randomly select a few of these based on key.
>
> --
> Ikai Lan
> Developer Programs Engineer, Google App Engine
> Blogger: http://googleappengine.blogspot.com
> Reddit: http://www.reddit.com/r/appengine
> Twitter: http://twitter.com/app_engine
>
>
>
> On Sat, Nov 27, 2010 at 10:27 AM, Stephen Johnson <[email protected]> wrote:
>
>> To extend on Tim's comments: I find that map reduce can be your best
>> friend in a lot of situations. Why not use map reduce to pre-generate random
>> test lists of keys? Then you're only going through all the keys every so
>> often and not every time you need a new test. Create a new Entity kind to
>> hold these pre-generated lists, and then when you need a new test, grab the
>> first one from the pre-generated lists of keys and then delete it (or just
>> mark it as used if you want a historical record of the test). Have your map
>> reduce job create as many as you think you'll need over a certain period, and
>> then just have a task that runs the map reduce job every so often.
>>
>> As a side note to any of the Googlers who maintain and update map-reduce:
>> this is another example of where it would be nice to have an option for
>> map-reduce to return only keys and not the entire entity, because, as in
>> this example, there is no need to incur the CPU cost of returning the entire
>> entity. This was also discussed in the "Bulk Deletion Woes" thread a couple
>> of weeks ago.
>>
>>
>> On Fri, Nov 26, 2010 at 10:44 PM, Tim Hoffman <[email protected]> wrote:
>>
>>> Hi
>>>
>>> Do you know your key structure and general distribution?
>>>
>>> If, for instance, your keys are ints, you could randomly choose a key,
>>> do a > query, and grab the first key past that value (or one a random
>>> number of positions past the first one);
>>> that way you only need to return a few keys/entities.
>>>
>>> To make it more likely to get hits, you could have a task queue task that
>>> maps the key ranges; then your random choice can come from a set of values
>>> more likely to give you an entity.
>>>
>>> Just a thought ;-)
>>>
>>> T
>>>
>>> On Nov 26, 11:35 pm, lein <[email protected]> wrote:
>>> > hello,
>>> >
>>> > I have a table of questions that has many rows. Every now and then I
>>> > have to generate a quiz by randomly selecting questions from
>>> > this table. Making a keys-only query and randomly selecting the
>>> > question keys from the resulting list of keys takes a lot of CPU
>>> > resources. Any suggestions on how I could do this better?
>>> >
>>> > thanks
>>>
>>> --
>>> You received this message because you are subscribed to the Google Groups
>>> "Google App Engine" group.
>>> To post to this group, send email to [email protected].
>>> To unsubscribe from this group, send email to
>>> [email protected].
>>> For more options, visit this group at
>>> http://groups.google.com/group/google-appengine?hl=en.
>>>
>>>
>
