Thanks Sean,

I just want to point out that if string_to_int were crashing, it seems like it 
should also have affected the first example (using the map phase only), and 
that one works and doesn't log any errors.

Also, I thought that the default value for the keep field is false, except for 
the final phase.  From (http://wiki.basho.com/MapReduce.html):

Omitting the keep field accepts its default value, which is false for all 
phases except the final phase (Riak assumes that you were most interested in 
the results of the last phase of your map/reduce query).
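
For illustration, the default described in that passage can be modeled in a few lines of JavaScript. This is only a sketch of the documented rule, not Riak's implementation, and the helper name effectiveKeep is hypothetical. Under this rule, the single reduce phase in kfr.json would keep its results even without an explicit "keep".

```javascript
// Hypothetical helper: models the documented "keep" default, not Riak's code.
// A phase keeps its results if "keep" is set explicitly; otherwise only the
// final phase of the query does.
function effectiveKeep(query) {
  return query.map(function (phase, index) {
    // Each phase object has a single entry such as "map" or "reduce".
    var kind = Object.keys(phase)[0];
    var spec = phase[kind];
    var isLast = index === query.length - 1;
    return typeof spec.keep === "boolean" ? spec.keep : isLast;
  });
}

// The reduce-only query from kfr.json: one phase, no explicit "keep".
var reduceOnly = [
  { reduce: { language: "javascript", source: "function(v, a) { return v; }" } }
];

// Under the documented rule the only phase is also the final phase.
console.log(effectiveKeep(reduceOnly));
```

Adding "keep": true to the phase makes the intent explicit and does not change the result in this model.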

I'll keep experimenting and see what I can come up with.  But so far in all of 
my tests it seems like you pretty much always have to have at least one map 
phase to keep mapred happy.

--gordon


On Jan 25, 2011, at 08:57, Sean Cribbs wrote:

I'm not sure why that's crashing (I suspect it's string_to_int on 0-prefixed 
numbers), but your phase has to have "keep":true to return any data to the 
client.

Sean Cribbs
Developer Advocate
Basho Technologies, Inc.
http://basho.com/

On Jan 25, 2011, at 9:53 AM, Gordon Tillman wrote:

Sean thanks again for the feedback.

Using just a reduce function seems to cause Riak problems (unless it's me doing 
something wrong).

For example, I populated a bucket called "junk" with the following command:

for s in `seq 00000 10000`; do curl -X POST -H 'Content-Type: text/plain' \
    http://localhost:8091/riak/junk/$s -d 0; done

Now if I try to select some keys using key filters and a map function, it works:

curl -X POST -H 'Content-Type: application/json' http://localhost:8091/mapred \
    -d @kfm.json
["5","25","37","10", ... ,"18","22","17"]

where kfm.json is:

{
    "inputs": {
        "bucket": "junk",
        "key_filters": [ ["string_to_int"], ["less_than", 100] ]
    },
    "query": [
        {
            "map": {
                "language": "javascript",
                "source": "function(v, a) { return [v.key]; }"
            }
        }
    ]
}
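
As a rough illustration of what the key filter chain in kfm.json is meant to select, the sketch below applies the same two steps to a handful of sample keys. It is a model of the logic only: the function names are hypothetical, parseInt stands in for Riak's string_to_int transform, and the sample keys (including the zero-prefixed "007") are made up for the example. How Riak itself treats zero-prefixed keys is the open question in this thread, and this model does not answer it.

```javascript
// Illustrative model of [["string_to_int"], ["less_than", 100]].
// Assumption: parseInt approximates Riak's string_to_int transform.
function stringToInt(key) {
  // Reject anything that is not a plain decimal integer.
  if (!/^-?\d+$/.test(key)) {
    throw new Error("not an integer key: " + key);
  }
  return parseInt(key, 10);
}

// Builds the predicate for ["less_than", limit].
function lessThan(limit) {
  return function (value) { return value < limit; };
}

// Returns the original keys whose transformed value passes the predicate.
function filterKeys(keys, transform, predicate) {
  return keys.filter(function (key) {
    return predicate(transform(key));
  });
}

// Made-up sample keys; "007" is zero-prefixed on purpose.
var sampleKeys = ["5", "25", "100", "9999", "007"];
console.log(filterKeys(sampleKeys, stringToInt, lessThan(100)));
```

In this model the transform is used only for the comparison, so the keys that come back are the original strings, which matches the shape of the output shown above.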

But if I try the same thing with just a reduce phase like this:

curl -X POST -H 'Content-Type: application/json' http://localhost:8091/mapred \
    -d @kfr.json -i

where kfr.json is:

{
    "inputs": {
        "bucket": "junk",
        "key_filters": [ ["string_to_int"], ["less_than", 100] ]
    },
    "query": [
        {
            "reduce": {
                "language": "javascript",
                "source": "function(v, a) { return v; }"
            }
        }
    ]
}

The curl command just hangs and I see this in the logs:

=CRASH REPORT==== 25-Jan-2011::08:46:11 ===
  crasher:
    initial call: riak_kv_keys_fsm:init/1
    pid: <0.26942.19>
    registered_name: []
    exception exit: badmsg
      in function  gen_fsm:terminate/7
      in call from proc_lib:init_p_do_apply/3
    ancestors: [<0.26941.19>]
    messages: 
[{EXIT,<0.26942.19>,normal},{EXIT,<0.26942.19>,normal},{$gen_event,{66195534,{kl,1278813932664540053428224228626747642198940975104,[<<"83">>,<<"9">>]}}},{$gen_event,{66195534,{kl,1278813932664540053428224228626747642198940975104,[<<"76">>]}}},{$gen_event,{66195534,{kl,1278813932664540053428224228626747642198940975104,[<<"95">>]}}},{$gen_event,{66195534,{kl,1278813932664540053428224228626747642198940975104,[<<"61">>]}}},{$gen_event,{66195534,1278813932664540053428224228626747642198940975104,done}}]
    links: []
    dictionary: []
    trap_exit: true
    status: running
    heap_size: 2584
    stack_size: 24
    reductions: 94951
  neighbours:

--gordon

On Jan 25, 2011, at 07:24, Sean Cribbs wrote:

Use a reduce phase instead, which doesn't force loading of the objects.  A 
simple identity reduce should do what you want: function(values,arg){ return 
values; }
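
A small sketch of why an identity reduce is a reasonable fit here: a reduce function may be called more than once, with earlier results fed back in alongside new inputs, and the identity function returns the same list regardless of how the inputs are batched. The driver runReduce below is hypothetical and only imitates that behavior; the [bucket, key] input shape is also an assumption made for the example.

```javascript
// The identity reduce suggested above.
var identityReduce = function (values, arg) { return values; };

// Hypothetical driver: feeds inputs to the reduce function in batches,
// passing each previous result back in together with the next batch.
function runReduce(reduceFn, inputs, batchSize) {
  var acc = [];
  for (var i = 0; i < inputs.length; i += batchSize) {
    var batch = inputs.slice(i, i + batchSize);
    acc = reduceFn(acc.concat(batch), null);
  }
  return acc;
}

// Assumed input shape: [bucket, key] pairs selected by the key filters.
var inputs = [["junk", "5"], ["junk", "25"], ["junk", "37"]];

// The result contains the same pairs whatever the batch size is.
console.log(runReduce(identityReduce, inputs, 2));
```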

Sean Cribbs
Developer Advocate
Basho Technologies, Inc.
http://basho.com/

On Jan 24, 2011, at 7:43 PM, Gordon Tillman wrote:

Greetings All,

I have a use case for our app where I need to fetch a list of keys that match 
some pattern and was hoping to be able to use key filters for that.

In my test I defined a key filter for the input phase of mapred and then 
defined just a single map phase that returns the object key. But there is 
considerable overhead with that map phase because (I'm assuming) Riak has to 
load each object to provide the necessary inputs to the map function.

Is there a way to do this without Riak having to actually load the objects?

Many thanks,

--gordon
_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com



