Cool, excellent starting point, Robin! Some additions in the same vein: dot (and other binary ops) between two different impls (with impl1 as the caller and impl2 as the method param, and vice versa), looking at what the effects of sparsity are, and creating vectors incrementally (with n different set() operations).
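A minimal sketch of those three additions, to make the shape of the test concrete. The Vec, DenseVec and SparseVec types here are hypothetical stand-ins written for this example, not Mahout's vector classes, and the timing loop is deliberately naive (no warm-up), so treat the numbers as illustrative only.

```java
import java.util.HashMap;
import java.util.Map;

public class CrossImplDotSketch {

    /** Hypothetical minimal vector interface; NOT Mahout's Vector API. */
    interface Vec {
        int size();
        double get(int index);
        void set(int index, double value);
        double dot(Vec other);
    }

    /** Array-backed stand-in for a dense implementation. */
    static final class DenseVec implements Vec {
        private final double[] values;
        DenseVec(int size) { values = new double[size]; }
        public int size() { return values.length; }
        public double get(int i) { return values[i]; }
        public void set(int i, double v) { values[i] = v; }
        public double dot(Vec other) {
            // Walks every index, so the cost depends on cardinality.
            double sum = 0.0;
            for (int i = 0; i < values.length; i++) sum += values[i] * other.get(i);
            return sum;
        }
    }

    /** Map-backed stand-in for a sparse implementation. */
    static final class SparseVec implements Vec {
        private final int size;
        private final Map<Integer, Double> values = new HashMap<>();
        SparseVec(int size) { this.size = size; }
        public int size() { return size; }
        public double get(int i) { Double v = values.get(i); return v == null ? 0.0 : v; }
        public void set(int i, double v) { if (v == 0.0) values.remove(i); else values.put(i, v); }
        public double dot(Vec other) {
            // Walks only stored entries, so the cost depends on the non-zero count.
            double sum = 0.0;
            for (Map.Entry<Integer, Double> e : values.entrySet()) sum += e.getValue() * other.get(e.getKey());
            return sum;
        }
    }

    /** Incremental creation: nonZeros separate set() calls at evenly spaced indices. */
    static void fill(Vec v, int nonZeros) {
        if (nonZeros < 1 || nonZeros > v.size()) throw new IllegalArgumentException("bad nonZeros");
        int step = v.size() / nonZeros;
        for (int k = 0; k < nonZeros; k++) v.set(k * step, k + 1.0);
    }

    /** Times caller.dot(param) over the given number of loops; returns elapsed nanoseconds. */
    static long timeDot(Vec caller, Vec param, int loops) {
        double sink = 0.0;
        long start = System.nanoTime();
        for (int i = 0; i < loops; i++) sink += caller.dot(param);
        long elapsed = System.nanoTime() - start;
        if (Double.isNaN(sink)) throw new IllegalStateException("unexpected NaN"); // keeps sink live
        return elapsed;
    }

    public static void main(String[] args) {
        int cardinality = 1000;
        int loops = 200;
        // Vary sparsity: 1%, 10% and 100% of the entries are non-zero.
        for (int nonZeros : new int[] {10, 100, 1000}) {
            Vec dense = new DenseVec(cardinality);
            Vec sparse = new SparseVec(cardinality);
            fill(dense, nonZeros);
            fill(sparse, nonZeros);
            System.out.println("nonZeros=" + nonZeros
                + " dense.dot(sparse)=" + timeDot(dense, sparse, loops) + "ns"
                + " sparse.dot(dense)=" + timeDot(sparse, dense, loops) + "ns");
        }
    }
}
```

The point of running both directions is that the caller's dot() decides the iteration strategy, so dense.dot(sparse) and sparse.dot(dense) can cost very different amounts even though they return the same value.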
-jake

On Feb 17, 2010 1:24 AM, "Robin Anil" <robin.a...@gmail.com> wrote:

It's checked in under utils: org.apache.mahout.benchmark.VectorBenchmarks.

It currently runs on full vectors (0-cardinality) only. Create, clone and dot are benchmarked.

All distance measures are benchmarked, where each unit is k = numOps times the time taken to calculate the distance measure between 2 vectors. This is to mimic k-means and other clustering.

It prints out the number of vectors processed and the number of megabytes read (to mimic the speed at which a dataset could be processed).

I know a lot of assumptions could be wrong, so please feel free to modify.

An output for cardinality = 1000, numVectors = 100, loop = 200, numOps = 10:
http://pastebin.com/f1b687091
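
For readers without the source at hand, the measurement described above can be sketched roughly as follows. This is not the code in VectorBenchmarks: the class and method names are made up for illustration, plain double[] arrays stand in for vectors, Euclidean distance stands in for "all distance measures", and the megabyte accounting (8 bytes per double entry, 1 MB = 1024 * 1024 bytes) is an assumption.

```java
import java.util.Random;

public class DistanceBenchmarkSketch {

    /** Euclidean distance between two equal-length arrays. */
    static double euclidean(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /** Full (dense) random vectors with the given cardinality. */
    static double[][] randomVectors(int numVectors, int cardinality, long seed) {
        Random random = new Random(seed);
        double[][] vectors = new double[numVectors][cardinality];
        for (double[] v : vectors) {
            for (int i = 0; i < cardinality; i++) v[i] = random.nextDouble();
        }
        return vectors;
    }

    /** Assumed accounting: 8 bytes per entry, 1 MB = 1024 * 1024 bytes. */
    static double megabytes(long vectorsProcessed, int cardinality) {
        return vectorsProcessed * (double) cardinality * 8.0 / (1024.0 * 1024.0);
    }

    /**
     * One unit of work is numOps distance computations for a single vector,
     * roughly like comparing a point against numOps centroids in k-means.
     * Returns elapsed nanoseconds.
     */
    static long run(double[][] vectors, int loop, int numOps) {
        int n = vectors.length;
        double sink = 0.0;
        long start = System.nanoTime();
        for (int l = 0; l < loop; l++) {
            for (int i = 0; i < n; i++) {
                for (int op = 0; op < numOps; op++) {
                    sink += euclidean(vectors[i], vectors[(i + op + 1) % n]);
                }
            }
        }
        long elapsed = System.nanoTime() - start;
        if (Double.isNaN(sink)) throw new IllegalStateException("unexpected NaN"); // keeps sink live
        return elapsed;
    }

    public static void main(String[] args) {
        // Same parameters as the run quoted above.
        int cardinality = 1000, numVectors = 100, loop = 200, numOps = 10;
        double[][] vectors = randomVectors(numVectors, cardinality, 42L);
        long nanos = run(vectors, loop, numOps);
        long processed = (long) loop * numVectors;
        System.out.println("vectors processed: " + processed);
        System.out.println("megabytes read:    " + megabytes(processed, cardinality));
        System.out.println("elapsed ms:        " + nanos / 1_000_000);
    }
}
```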