Thanks for the quick reply! I've pasted a small self-contained example at the bottom. It creates the matrices in advance, but nothing meaningful changes if they're created on a per-operation basis.
Results for 50 multiplications of [size]x[size] matrices: Size: 10, total time: 0.012 seconds, time per: 0.000 seconds Size: 100, total time: 0.062 seconds, time per: 0.001 seconds Size: 300, total time: 3.050 seconds, time per: 0.061 seconds Size: 500, total time: 15.186 seconds, time per: 0.304 seconds Size: 600, total time: 17.532 seconds, time per: 0.351 seconds For comparison: Results for 50 additions of [size]x[size] matrices (which should be faster, be the extent of the difference is nonetheless striking to me): Size: 10, total time: 0.011 seconds, time per: 0.000 seconds Size: 100, total time: 0.012 seconds, time per: 0.000 seconds Size: 300, total time: 0.020 seconds, time per: 0.000 seconds Size: 500, total time: 0.032 seconds, time per: 0.001 seconds Size: 600, total time: 0.050 seconds, time per: 0.001 seconds Results for 50 inversions of a [size]x[size] matrix, which I'd expect to be slower than multiplication for larger matrices: Size: 10, total time: 0.014 seconds, time per: 0.000 seconds Size: 100, total time: 0.074 seconds, time per: 0.001 seconds Size: 300, total time: 1.005 seconds, time per: 0.020 seconds Size: 500, total time: 5.024 seconds, time per: 0.100 seconds Size: 600, total time: 9.297 seconds, time per: 0.186 seconds I hope this is useful, and if I'm doing something wrong that's leading to this performance gap, I'd love to know. ---- import org.apache.commons.math3.linear.LUDecomposition; import org.apache.commons.math3.linear.Array2DRowRealMatrix; import org.apache.commons.math3.linear.RealMatrix; public class AMCMatrices { public static void main(String[] args) { miniTest(0); } public static void miniTest(int tType) { int samples = 50; int sizes[] = { 10, 100, 300, 500, 600 }; for (int sI = 0; sI < sizes.length; sI++) { int mSize = sizes[sI]; org.apache.commons.math3.linear.RealMatrix m0 = buildM(mSize, mSize); RealMatrix m1 = buildM(mSize, mSize); long start = System.nanoTime(); for (int n = 0; n < samples; n++) { switch (tType) { case 0: m0.multiply(m1); break; case 1: m0.add(m1); break; case 2: new LUDecomposition(m0).getSolver().getInverse(); break; } } long end = System.nanoTime(); double dt = ((double) (end - start)) / 1E9; System.out.println(String.format( "Size: %d, total time: %3.3f seconds, time per: %3.3f seconds", mSize, dt, dt / samples)); } } public static Array2DRowRealMatrix buildM(int r, int c) { double[][] matVals = new double[r][c]; for (int i = 0; i < r; i++) { for (int j = 0; j < c; j++) { matVals[i][j] = Math.random(); } } return new Array2DRowRealMatrix(matVals); } } ---- On 5 April 2016 at 19:36, Gilles <gil...@harfang.homelinux.org> wrote: > Hi. > > On Tue, 5 Apr 2016 15:43:04 +0100, Chris Lucas wrote: > >> I recently ran a benchmark comparing the performance math commons 3.6.1's >> linear algebra library to the that of scala Breeze ( >> https://github.com/scalanlp/breeze). >> >> I looked at det, inverse, Cholesky factorization, addition, and >> multiplication, including matrices with 10, 100, 500, and 1000 elements, >> with symmetric, non-symmetric, and non-square cases where applicable. >> > > It would be interesting to add this to the CM documentation: > https://issues.apache.org/jira/browse/MATH-1194 > > In general, I was pleasantly surprised: math commons performed about as >> well as Breeze, despite the latter relying on native libraries. There was >> one exception, however: >> >> m0.multiply(m1) >> >> where m0 and m1 are both Array2DRowRealMatrix instances. It scaled very >> poorly in math commons, being much slower than nominally more expensive >> operations like inv and the Breeze implementation. Does anyone have a >> thought as to what's going on? >> > > Could your provide more information such as a plot of performance vs size? > A self-contained and minimal code to run would be nice too. > See the CM micro-benchmarking tool here: > > https://github.com/apache/commons-math/blob/master/src/test/java/org/apache/commons/math4/PerfTestUtils.java > And an example of how to use it: > > https://github.com/apache/commons-math/blob/master/src/userguide/java/org/apache/commons/math4/userguide/FastMathTestPerformance.java > > In case it's useful, one representative test >> involves multiplying two instances of >> >> new Array2DRowRealMatrix(matVals) >> >> where matVals is 1000x1000 entries of math.random and the second instance >> is created as part of the loop. This part of the benchmark is not specific >> to the expensive multiplication step, and takes very little time relative >> to the multiplication itself. I'm using System.nanotime for the timing, >> and >> take the average time over several consecutive iterations, on a 3.5 GHz >> Intel Core i7, Oracle JRE (build 1.8.0_05-b13). >> > > You might want to confirm the behaviour using JMH (becoming the Java > standard > for benchmarking): > http://openjdk.java.net/projects/code-tools/jmh/ > > > Best regards, > Gilles > > >> Thanks, >> >> Chris >> > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: user-unsubscr...@commons.apache.org > For additional commands, e-mail: user-h...@commons.apache.org > >