[
https://issues.apache.org/jira/browse/MAHOUT-1191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Robin Anil updated MAHOUT-1191:
-------------------------------
Attachment: MAHOUT-1191.patch
I refactored the benchmark suite to make it much easier to add a new benchmark.
I also made sure there is a data dependency on each result, which prevents the
JVM from optimizing the benchmarked work away.
Example of a benchmark that returns a double value:
{noformat}
mark.printStats(mark.getRunner().benchmarkD(new BenchmarkFnD() {
  @Override
  public Double apply(Integer i) {
    return mark.vectors[0][mark.vIndex(i)].dot(mark.vectors[0][mark.vIndex(randIndex())]);
  }
}), DOT_PRODUCT, DENSE_VECTOR);
{noformat}
Example of a benchmark that returns no value:
{noformat}
mark.printStats(mark.getRunner().benchmark(new BenchmarkFn() {
  @Override
  public Boolean apply(Integer i) {
    mark.vectors[0][mark.vIndex(i)] = mark.vectors[0][mark.vIndex(i)].clone();
    return depends(mark.vectors[0][mark.vIndex(i)]);
  }
}), CLONE, DENSE_VECTOR);
{noformat}
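To illustrate the data dependency described above, here is a minimal, self-contained sketch. It is not the code from the patch: `DataDependencySketch`, `benchmark`, `Result`, and `dot` are hypothetical names chosen for illustration. The idea is that every call's return value feeds an accumulator that the caller can read, so the JIT cannot discard the benchmarked call as dead code.

```java
import java.util.function.IntToDoubleFunction;

// Hypothetical sketch; names are illustrative and not taken from the patch.
final class DataDependencySketch {

  // Holds the timing together with the accumulated value ("sink").
  static final class Result {
    final int calls;
    final long elapsedNanos;
    final double sink;

    Result(int calls, long elapsedNanos, double sink) {
      this.calls = calls;
      this.elapsedNanos = elapsedNanos;
      this.sink = sink;
    }
  }

  // Runs fn `calls` times and folds every return value into `sink`.
  // Because `sink` is returned to the caller, each call's result is live.
  static Result benchmark(IntToDoubleFunction fn, int calls) {
    double sink = 0.0;
    long start = System.nanoTime();
    for (int i = 0; i < calls; i++) {
      sink += fn.applyAsDouble(i);
    }
    long elapsed = System.nanoTime() - start;
    return new Result(calls, elapsed, sink);
  }

  // Plain dense dot product used as the benchmarked operation.
  static double dot(double[] a, double[] b) {
    double sum = 0.0;
    for (int k = 0; k < a.length; k++) {
      sum += a[k] * b[k];
    }
    return sum;
  }
}
```

A caller would typically print or compare `Result.sink` once at the end, so that the accumulated value is observably used.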
Changing the operations so that they run on randomly chosen vectors brings back
the difference between SASV (SequentialAccessSparseVector) and RASV
(RandomAccessSparseVector). Here is an updated benchmark run with the new code.
It should look more like what you were expecting: there is a 47% gap in dot
product between SASV and RASV when the operands are picked at random.
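For readers unfamiliar with the two sparse layouts, the following sketch shows why their dot products can cost different amounts. It is an illustration only, not Mahout's implementation: `SparseDotSketch` and both methods are hypothetical, and it assumes a sequential-access vector is stored as index-sorted parallel arrays while a random-access vector is stored as a hash map.

```java
import java.util.Map;

// Hypothetical sketch of the two sparse layouts; not Mahout's code.
final class SparseDotSketch {

  // Sequential-access layout: parallel arrays with indices sorted ascending.
  // The dot product is a linear merge over both index arrays.
  static double dotSequential(int[] idxA, double[] valA, int[] idxB, double[] valB) {
    double sum = 0.0;
    int i = 0;
    int j = 0;
    while (i < idxA.length && j < idxB.length) {
      if (idxA[i] == idxB[j]) {
        sum += valA[i] * valB[j];
        i++;
        j++;
      } else if (idxA[i] < idxB[j]) {
        i++;
      } else {
        j++;
      }
    }
    return sum;
  }

  // Random-access layout: hash map from index to value.
  // Iterate the smaller map and probe the larger one for each index.
  static double dotRandomAccess(Map<Integer, Double> a, Map<Integer, Double> b) {
    Map<Integer, Double> small = a.size() <= b.size() ? a : b;
    Map<Integer, Double> large = (small == a) ? b : a;
    double sum = 0.0;
    for (Map.Entry<Integer, Double> e : small.entrySet()) {
      Double other = large.get(e.getKey());
      if (other != null) {
        sum += e.getValue() * other;
      }
    }
    return sum;
  }
}
```

The merge walks memory in order, while the hash-based version pays for a lookup per non-zero element, which is one plausible source of a gap like the one reported here.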
{noformat}
Clone
                          nCalls  sum (s)  min (ms)  max (ms)  mean (us)  stdDev (us)  Speed (/sec)  Rate (MB/s)
DenseVector               262     0.1      0.28      13.7      381.96     824.72       2618.06       31.42
RandSparseVector          13641   0.1      0         0.04      7.33       1.96         136405.91     1636.87
SeqSparseVector           54125   0.1      0         0.04      1.85       0.78         541244.62     6494.93

Create (copy)
                          nCalls  sum (s)  min (ms)  max (ms)  mean (us)  stdDev (us)  Speed (/sec)  Rate (MB/s)
DenseVector               603     0.12     0.09      38.58     191.26     1677.21      5228.61       62.74
RandSparseVector          1420    0.1      0.06      0.11      70.46      5.08         14193.19      170.32
SeqSparseVector           3458    0.1      0.03      0.06      28.92      2.16         34579.65      414.96

Create (incrementally)
                          nCalls  sum (s)  min (ms)  max (ms)  mean (us)  stdDev (us)  Speed (/sec)  Rate (MB/s)
DenseVector               66327   0.5      0         0.08      7.54       1.88         132651.88     1591.82
RandSparseVector          5369    0.5      0.08      14.96     93.14      203          10736.22      128.83
SeqSparseVector           1821    0.5      0.25      0.48      274.64     13.2         3641.16       43.69
Clusters                  5541    0.5      0.08      0.14      90.24      5.21         11081.93      132.98

Deserialize
                          nCalls  sum (s)  min (ms)  max (ms)  mean (us)  stdDev (us)  Speed (/sec)  Rate (MB/s)
DenseVector               93      0.29     2.83      4.8       3087.16    293.96       323.92        3.89
RandSparseVector          5440    0.69     0.11      13.58     126.94     182.92       7877.84       94.53
SeqSparseVector           8414    0.56     0.06      0.15      66.9       6.87         14946.61      179.36

DotProduct
                          nCalls  sum (s)  min (ms)  max (ms)  mean (us)  stdDev (us)  Speed (/sec)  Rate (MB/s)
DenseVector               925     0.1      0.1       0.14      108.14     6.07         9247.5        110.97
RandSparseVector          3065    0.1      0.01      0.05      32.63      2.4          30643.87      367.73
SeqSparseVector           4413    0.1      0         0.04      22.66      2.8          44126.47      529.52
Dense.fn(Rand)            2438    0.1      0.02      0.06      41.02      2.8          24378.05      292.54
Dense.fn(Seq)             3492    0.1      0.02      0.05      28.64      2.13         34919.65      419.04
Rand.fn(Dense)            1820    0.1      0.03      0.1       54.96      4.46         18194.18      218.33
Rand.fn(Seq)              1046    0.1      0.09      0.13      95.63      5.3          10456.76      125.48
Seq.fn(Dense)             2379    0.1      0.04      0.06      42.04      2.88         23789.53      285.47
Seq.fn(Rand)              3450    0.1      0.02      0.05      28.99      2.43         34496.21      413.95

Serialize
                          nCalls  sum (s)  min (ms)  max (ms)  mean (us)  stdDev (us)  Speed (/sec)  Rate (MB/s)
DenseVector               117     0.5      4.07      6.4       4308.54    295.07       232.1         2.79
RandSparseVector          4694    0.5      0.09      0.29      106.52     17.1         9387.64       112.65
SeqSparseVector           7833    0.5      0.05      0.24      63.83      13.64        15665.5       187.99

org.apache.mahout.common.distance.CosineDistanceMeasure
                          nCalls  sum (s)  min (ms)  max (ms)  mean (us)  stdDev (us)  Speed (/sec)  Rate (MB/s)
DenseVector               931     0.1      0.1       0.34      107.51     10.18        9301.07       111.61
RandSparseVector          2237    0.1      0.02      0.12      44.72      4.42         22361.28      268.34
SeqSparseVector           4216    0.1      0         0.05      23.72      2.66         42149.89      505.8
Dense.fn(Rand)            1563    0.1      0.06      0.1       63.98      4.37         15629.69      187.56
Dense.fn(Seq)             2416    0.1      0.02      0.06      41.4       2.72         24155.41      289.86
Rand.fn(Dense)            1555    0.1      0.06      0.21      64.33      7.44         15544.71      186.54
Rand.fn(Seq)              3427    0.1      0.02      0.05      29.18      2.49         34266.57      411.2
Seq.fn(Dense)             2372    0.1      0.04      0.09      42.17      5.3          23714.07      284.57
Seq.fn(Rand)              1031    0.1      0.09      0.12      97.08      5.06         10300.52      123.61
Closest C w/o Elkan's tr  5       0.57     113.56    116.46    114758.4   1004.59      8.71          0.1
Closest C w/ Elkan's tri  5       0.52     103.33    106.02    104801.4   946.6        9.54          0.11

org.apache.mahout.common.distance.EuclideanDistanceMeasure
                          nCalls  sum (s)  min (ms)  max (ms)  mean (us)  stdDev (us)  Speed (/sec)  Rate (MB/s)
DenseVector               910     0.1      0.1       0.32      109.99     21.16        9091.73       109.1
RandSparseVector          2110    0.1      0.04      0.22      47.4       13.73        21098.31      253.18
SeqSparseVector           4188    0.1      0.02      0.04      23.88      2.13         41871.21      502.45
Dense.fn(Rand)            1499    0.1      0.04      0.37      66.73      25.49        14986.1       179.83
Dense.fn(Seq)             2360    0.1      0.03      0.18      42.37      11.84        23599.06      283.19
Rand.fn(Dense)            1512    0.1      0.04      0.37      66.18      28.26        15111.24      181.33
Rand.fn(Seq)              3229    0.1      0.02      0.22      30.97      16.2         32288.38      387.46
Seq.fn(Dense)             2376    0.1      0.01      0.17      42.09      11.59        23757.86      285.09
Seq.fn(Rand)              1009    0.1      0.08      0.27      99.14      18.76        10086.87      121.04
Closest C w/o Elkan's tr  5       0.59     117.39    118.15    117909     266.16       8.48          0.1
Closest C w/ Elkan's tri  5       0.54     107.42    109.35    108320.6   680.34       9.23          0.11

org.apache.mahout.common.distance.ManhattanDistanceMeasure
                          nCalls  sum (s)  min (ms)  max (ms)  mean (us)  stdDev (us)  Speed (/sec)  Rate (MB/s)
DenseVector               240     0.1      0.36      0.51      417.29     19.54        2396.43       28.76
RandSparseVector          386     0.1      0.2       0.32      259.63     9.88         3851.64       46.22
SeqSparseVector           183     0.1      0.48      0.6       547.05     14.8         1827.97       21.94
Dense.fn(Rand)            267     0.1      0.31      0.51      375.66     20.06        2661.96       31.94
Dense.fn(Seq)             240     0.1      0.31      15.14     417.11     952.29       2397.46       28.77
Rand.fn(Dense)            267     0.1      0.29      0.43      375        16.32        2666.69       32
Rand.fn(Seq)              380     0.1      0.2       0.33      263.74     14.46        3791.54       45.5
Seq.fn(Dense)             148     0.1      0.57      1.15      679.36     51.07        1471.96       17.66
Seq.fn(Rand)              126     0.1      0.4       0.85      798.13     72.47        1252.92       15.04
Closest C w/o Elkan's tr  1       0.75     745.96    745.96    745958     0            1.34          0.02
Closest C w/ Elkan's tri  1       0.64     643.84    643.84    643845     0            1.55          0.02

org.apache.mahout.common.distance.SquaredEuclideanDistanceMeasure
                          nCalls  sum (s)  min (ms)  max (ms)  mean (us)  stdDev (us)  Speed (/sec)  Rate (MB/s)
DenseVector               923     0.1      0.1       0.34      108.42     18.44        9222.99       110.68
RandSparseVector          2121    0.1      0.04      0.21      47.17      13.04        21201.1       254.41
SeqSparseVector           4187    0.1      0.01      0.04      23.89      2.06         41862.05      502.34
Dense.fn(Rand)            1526    0.1      0.05      0.32      65.53      20.51        15259.69      183.12
Dense.fn(Seq)             2362    0.1      0.02      0.18      42.34      11.32        23616.22      283.39
Rand.fn(Dense)            1531    0.1      0.05      0.37      65.33      23.57        15306.48      183.68
Rand.fn(Seq)              3287    0.1      0.02      0.2       30.42      13.1         32867.7       394.41
Seq.fn(Dense)             2359    0.1      0.02      0.19      42.4       12.52        23584.34      283.01
Seq.fn(Rand)              1028    0.1      0.09      0.28      97.36      15.26        10270.86      123.25
Closest C w/o Elkan's tr  5       0.59     118.12    118.88    118508.8   317.97       8.44          0.1
Closest C w/ Elkan's tri  5       0.54     107.51    108.9     108285.2   473.7        9.23          0.11

org.apache.mahout.common.distance.TanimotoDistanceMeasure
                          nCalls  sum (s)  min (ms)  max (ms)  mean (us)  stdDev (us)  Speed (/sec)  Rate (MB/s)
DenseVector               849     0.1      0.1       0.32      117.8      24.88        8488.9        101.87
RandSparseVector          2181    0.1      0.02      0.07      45.85      4.01         21808.91      261.71
SeqSparseVector           4169    0.1      0         0.04      23.99      2.81         41680.41      500.16
Dense.fn(Rand)            1576    0.1      0.03      0.09      63.48      3.86         15751.65      189.02
Dense.fn(Seq)             2389    0.1      0.03      0.07      41.87      2.82         23883.55      286.6
Rand.fn(Dense)            1537    0.1      0.04      0.1       65.1       4.41         15362.01      184.34
Rand.fn(Seq)              3384    0.1      0.03      0.2       29.56      3.78         33831.88      405.98
Seq.fn(Dense)             2379    0.1      0.03      0.07      42.04      2.9          23784.05      285.41
Seq.fn(Rand)              1035    0.1      0.09      0.15      96.65      5.16         10347         124.16
Closest C w/o Elkan's tr  5       0.57     112.26    115.49    113565.6   1149.39      8.81          0.11
Closest C w/ Elkan's tri  5       0.52     102.65    105.42    104200.8   988.84       9.6           0.12
{noformat}
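As a reading aid for the columns above, here is a small sketch of how per-call timings could be folded into such statistics. It is an assumption, not the suite's actual statistics class: `TimingStatsSketch` and its methods are hypothetical, the standard deviation is computed as a population standard deviation, and the Rate column is left out because its bytes-per-call factor is not stated here. The reported numbers are at least consistent with mean ≈ sum / nCalls and Speed ≈ nCalls / sum (for example, 262 calls in about 0.1s gives roughly 2620 calls/sec).

```java
// Hypothetical sketch; not the statistics class used by the benchmark suite.
final class TimingStatsSketch {
  private int nCalls;
  private long sumNanos;
  private long minNanos = Long.MAX_VALUE;
  private long maxNanos = Long.MIN_VALUE;
  private double sumSquaredNanos;

  // Records the duration of one benchmarked call, in nanoseconds.
  void addCall(long nanos) {
    nCalls++;
    sumNanos += nanos;
    minNanos = Math.min(minNanos, nanos);
    maxNanos = Math.max(maxNanos, nanos);
    sumSquaredNanos += (double) nanos * nanos;
  }

  int nCalls() {
    return nCalls;
  }

  double minMillis() {
    return minNanos / 1e6;
  }

  double maxMillis() {
    return maxNanos / 1e6;
  }

  // Mean call time in microseconds: sum / nCalls.
  double meanMicros() {
    return (double) sumNanos / nCalls / 1000.0;
  }

  // Population standard deviation in microseconds (an assumption; the
  // suite may use the sample form instead).
  double stdDevMicros() {
    double meanNanos = (double) sumNanos / nCalls;
    double variance = sumSquaredNanos / nCalls - meanNanos * meanNanos;
    return Math.sqrt(Math.max(0.0, variance)) / 1000.0;
  }

  // Calls per second: nCalls divided by the total time in seconds.
  double speedPerSec() {
    return nCalls / (sumNanos / 1e9);
  }
}
```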
> Cleanup Vector Benchmarks make it less variable
> -----------------------------------------------
>
> Key: MAHOUT-1191
> URL: https://issues.apache.org/jira/browse/MAHOUT-1191
> Project: Mahout
> Issue Type: Bug
> Reporter: Robin Anil
> Assignee: Robin Anil
> Attachments: MAHOUT-1191.patch, MAHOUT-1191.patch
>
>