Hi,

Below is my code for running a MapReduce job from Python. I have a
six-node cluster, with 2 cores and 4 GB of RAM per node. There is no
other load on the cluster, which holds about 3 million keys and uses
LevelDB with Riak 1.2. The job below takes a terribly long time: it has
never finished, and I don't even know how to check whether it is still
running, other than that the Python script has not timed out. The number
of executed mappers in the stats is flatlined when I look at it in
Graphite. On test queries, the code below works.

So, how do I debug what is going on?

def main():
    client = riak.RiakClient(host=riak_host, port=8087,
                             transport_class=riak.transports.pbc.RiakPbcTransport)
    query = client.add(bucket)
    filters = key_filter.tokenize(":", filter_map['date']) + \
        (key_filter.starts_with('201210'))
    #& key_filter.tokenize(":", filter_map['country']).eq("US") \
    #& key_filter.tokenize(":", filter_map['campaign_id']).eq("t1") \
    query.add_key_filters(filters)
    query.map('''
    function(value, keyData, arg) {
        var data = Riak.mapValuesJson(value)[0];
        if (data['adx'] == 'gdn') {
            var alt_key = data['hw'];
            var obj = {};
            obj[alt_key] = 1;
            return [ obj ];
        } else {
            return [];
        }
    }''')
    query.reduce('''
    function(values, arg) {
        return [ values.reduce(function(acc, item) {
            for (var state in item) {
                if (acc[state])
                    acc[state] += item[state];
                else
                    acc[state] = item[state];
            }
            return acc;
        })];
    }
    ''')
    for result in query.run(timeout=300000):
        print result
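
To rule out the job logic itself, the key filter, map and reduce steps can be mimicked in plain Python on a handful of records. This is only a rough sketch and does not talk to Riak at all. The sample keys and values are made up, and it assumes keys look like "date:country:campaign_id" with filter_map['date'] being the 1-based token position 1; neither is taken from my real data.

```python
# Local, pure-Python stand-in for the job above; nothing here touches Riak.
# ASSUMPTIONS (hypothetical): keys look like "date:country:campaign_id",
# filter_map['date'] is token position 1 (1-based), and the sample
# records below are invented for illustration.
from collections import Counter

SAMPLE = {
    "20121001:US:t1": {"adx": "gdn", "hw": "iphone"},
    "20121002:US:t1": {"adx": "gdn", "hw": "android"},
    "20121003:CA:t2": {"adx": "gdn", "hw": "iphone"},
    "20121004:US:t1": {"adx": "other", "hw": "iphone"},
    "20120930:US:t1": {"adx": "gdn", "hw": "iphone"},
}


def key_matches(key, position=1, prefix="201210"):
    """Mimic tokenize(":", position) followed by starts_with(prefix)."""
    tokens = key.split(":")
    return len(tokens) >= position and tokens[position - 1].startswith(prefix)


def map_phase(data):
    """Mirror the JavaScript map: emit {hw: 1} for 'gdn' records only."""
    if data.get("adx") == "gdn":
        return [{data["hw"]: 1}]
    return []


def reduce_phase(values):
    """Mirror the JavaScript reduce: sum the counts for each name."""
    totals = Counter()
    for item in values:
        totals.update(item)  # adds counts rather than overwriting them
    return [dict(totals)]


def run_local(records):
    """Apply the key filter, then map, then reduce, all in memory."""
    mapped = []
    for key, data in sorted(records.items()):
        if key_matches(key):
            mapped.extend(map_phase(data))
    return reduce_phase(mapped)


print(run_local(SAMPLE))
```

If the counts come out as expected on a small sample like this, the logic is probably fine and the slowness is more likely in how the job runs on the cluster, for example in the key filter having to look at every key in the bucket.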
_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com