Riak 1.4 - fastest way to count all records in bucket (100+ millions)

Christian Rosnes Wed, 31 Jul 2013 00:56:28 -0700

Hi,

I have a 4-node Riak 1.4 test cluster on Azure
(Large instances: 4 cores, 7 GB RAM each).


Is the script below the fastest way to do a full count
of all the records in a bucket in Riak 1.4?
Or is there another approach I could try that might be faster?

Also, are there any parameters in the configuration files
that can influence the speed of this type of "large"
Erlang MapReduce job?


Thanks.

Christian
@NorSoulx

--

riak01$ ./count.all.records.in.bucket.sh

Counting all records in bucket: entries (Wed Jul 31 05:32:36 UTC 2013)

[109 542 663]

real    116m7.132s
user    0m0.000s
sys     0m0.376s

Done: Wed Jul 31 07:28:43 UTC 2013
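
To put the run above in perspective, the reported count and wall-clock time can be turned into a rough throughput figure. This is only a sketch: the variable names are illustrative, the numbers are copied from the output above, and the per-node figure assumes the work was spread evenly over the 4 nodes, which the output itself does not confirm.

```shell
#!/bin/sh
# Figures copied from the run output above.
records=109542663   # count returned by the MapReduce job
minutes=116         # "real" time, minutes part
seconds=7.132       # "real" time, seconds part
nodes=4             # cluster size (even distribution is an assumption)

# Convert the wall-clock time to seconds, then derive records per second.
elapsed=$(awk -v m="$minutes" -v s="$seconds" 'BEGIN { printf "%.3f", m * 60 + s }')
rate=$(awk -v r="$records" -v e="$elapsed" 'BEGIN { printf "%.0f", r / e }')
per_node=$(awk -v r="$rate" -v n="$nodes" 'BEGIN { printf "%.0f", r / n }')

echo "elapsed:  ${elapsed} s"
echo "cluster:  ${rate} records/s"
echo "per node: ${per_node} records/s"
```

A figure like this gives a baseline to compare against when trying other approaches or configuration changes on the same cluster.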

Script: count.all.records.in.bucket.sh
--------------------------------------
time curl -XPOST http://localhost:8098/mapred \
  -H 'Content-Type: application/json' \
  -d '{"inputs":{
           "bucket":"entries",
           "index":"$bucket",
           "key":"entries"
       },
       "query":[{"reduce":{"language":"erlang",
                           "module":"riak_kv_mapreduce",
                           "function":"reduce_count_inputs",
                           "arg":{"do_prereduce":true}
                          }
               }],
       "timeout": 90000000}'

_______________________________________________
riak-users mailing list
[email protected]
http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
