On 22.01.2013, at 06:19, Neil Toronto <neil.toro...@gmail.com> wrote:
> On 01/21/2013 04:33 PM, Berthold Bäuml wrote: >> >>> >>> I just did that. Here are the types: >>> >>> real-matrix* : (Array Real) (Array Real) -> (Array Real) >>> >>> flonum-matrix* : (Array Flonum) (Array Flonum) -> (Array Flonum) >>> >>> flmatrix* : FlArray FlArray -> FlArray >>> >>> Results so far, measured in DrRacket with debugging off: >>> >>> Function Size Time >>> ----------------------------------------- >>> matrix* 100x100 340ms >>> real-matrix* 100x100 40ms >>> flonum-matrix* 100x100 10ms >>> flmatrix* 100x100 6ms >>> >>> matrix* 1000x1000 418000ms >>> real-matrix* 1000x1000 76000ms >>> flonum-matrix* 1000x1000 7000ms >>> flmatrix* 1000x1000 4900ms >>> >>> The only difference between `real-matrix*' and `flonum-matrix*' is that the >>> former uses `+' and `*' and the latter uses `fl+' and `fl*'. But if I can >>> inline `real-matrix*', TR's optimizer will change the former to the latter, >>> making `flonum-matrix*' unnecessary. (FWIW, this would be the largest >>> speedup TR's optimizer will have ever shown me.) >>> >>> It looks like the biggest speedup comes from doing only flonum ops in the >>> inner loop sum, which keeps all the intermediate flonums unboxed (i.e. not >>> heap-allocated or later garbage-collected). >>> >>> Right now, `flmatrix*' is implemented a bit stupidly, so I could speed it >>> up further. I won't yet, because I haven't settled on a type for matrices >>> of unboxed flonums. The type has to work with LAPACK if it's installed, >>> which `FlArray' doesn't do because its data layout is row-major and LAPACK >>> expects column-major. >>> >>> I'll change `matrix*' to look like `real-matrix*'. It won't give the very >>> best performance, but it's a 60x speedup for 1000x1000 matrices. >>> >> >> These results look very promising, esp. if , as you mentioned, in the end >> the real-matrix* will automatically reach the flonum-matrix* performance for >> Flonums and the flmatrix* automatically switches to a LAPACK based variant, >> when available. For the latter it would be great if one could even change >> the used library to, e.g., redirect to a installation of the highly >> efficient MKL library from Intel. >> >> Looking forward to benchmark it against Numpy and Mathematica (which is MKL >> based) again! > > The next release candidate will have a `matrix*' that runs as fast as > `flonum-matrix*' on (Matrix Flonum) arguments, and as fast as `real-matrix*' > on (Matrix Real) arguments. Cool! > (Curiously, I wasn't able to speed up `flmatrix*' at all. The "a bit > stupidly" implementation was actually the fastest I could do, which probably > means it's as fast as it can be done in Racket without dropping to C.) > > Additionally, all the matrix functions will have more precise types to make > it easy for TR to prove that operations on (Matrix Flonum) values yield > (Matrix Flonum) values. (Currently, it only proves (Matrix Real) and (Matrix > Number).) Example: > > > (:print-type (matrix-qr (matrix [[1.0 2.0] [3.0 4.0]]))) > (Values (Array Flonum) (Array Flonum)) > > (:print-type (matrix-qr (matrix [[1.0+0.0i 2.0+0.0i] > [3.0+0.0i 4.0+0.0i]]))) > (Values (Array Float-Complex) (Array Float-Complex)) > > The current release candidate returns two (Array Real) and (Array Number) > instead. > > Since I've brought up decompositions, you should know that these, > `matrix-inverse', `matrix-solve', and pretty much every O(n^3) operation > except multiplication will be slow on large matrices. I can't think of a way > to inline them in a way that TR can optimize. For fast versions, which will > use LAPACK if it's installed, you'll have to wait for 5.3.3. :/ Will the LAPACK bindings be in the git head sooner, maybe ;-) > > Is the MKL library binary-compatible with LAPACK? > I did not try it myself yet, but the Intel documentation does say so: "If your application already relies on the BLAS or LAPACK functionality, simply re-link with Intel® MKL to get better performance on Intel and compatible architectures." http://software.intel.com/en-us/intel-mkl/ Berthold > Neil ⊥ > ____________________ Racket Users list: http://lists.racket-lang.org/users