I wrote in the past that we should make more use of modular methods and that for this we need optimized low-level operations. I have now made significant progress in this direction. You can find a preview at:
http://www.math.uni.wroc.pl/~hebisch/fricas/mod1 To compile do: )read compi.input Once all files are compiled it should be enough to do: )read load.input The mop-tst.input and misol-tst.input files and comments in matop.spad contain some example inputs. I have tried to stay if possible in Spad, however I use a few lisp macros (in ops32-64.lisp) as primitives. u32vec.spad implements vectors of unsigned 32-bit integers, which are crucial data structure for fast modular computation. matop.spad implements relatively fast modular LU decomposition and modular matrix inverse (we could easily add other operations like determinant). vecrec2.spad is intended to be general package for reconstruction of integer and rational results from modular ones. misol.spad shows sample application: solving system of linear equations with integer coefficients over Q. Earlier versions of misol was posted on this list, but that one is much faster due to faster modular operations. To give you some idea about speed, on my machine modular inverse of 100 by 100 matrix takes about 5.2 ms, LU decomposition about 1.8 ms, misol can solve 100 by 100 system of equations with small random coefficients in 75 ms. Theoretical peak for my machine is 0.208 ms for 100 by 100 matix inverse, so we are still a lot (about 25 times) below theoretical peak. For 1000 by 1000 matrix we need 3.06 sec, while theoretical peak is 0.208 sec so this time we are 15 times below theoretical peak. Still, this is much faster than current generic routines. Currently I get this speed only using 64-bit sbcl (and safety 0 setting). On 32-bit machines or using other Lisp speed is much lower, for example currently on 100 by 100 modular inverse ECL is 15 times slower. However, it should be possible to enhance ECL so that for matop it will generate faster code than sbcl. Also, for ECL penalty for using 32-bit machines should be much smaller than for sbcl. Why having 64-bit machine matter: we want efficient operations for all intermediate quantities which apper in modular computations. If we have to 32 bit numbers then product needs 64 bits. When computing dot products we want to avoid computing remainders of intermediate sums, so we need to add several product without overflow -- for dot product with 1000 terms this limits primes to about 25 bits. If we limited intermediate results to 32 bits, then effectively this whould limit primes to about 10 bits, whic is way too small. I have several other routines at various stage of developement, in particular new routines for guessing and polynomial gcd. My plan is to start adding modular routines to trunk in the near future. Let me add that currently the new code is rather ugly and somewhat incoherent. I want to clean it somewhat and make more generic, but I also feel that it already may give us substantial performance benefits. Also, only putting code to use will show us what needs to be changed. Comments welcome. -- Waldek Hebisch [email protected] --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "FriCAS - computer algebra system" group. To post to this group, send email to [email protected] To unsubscribe from this group, send email to [email protected] For more options, visit this group at http://groups.google.com/group/fricas-devel?hl=en -~----------~----~----~----~------~----~------~--~---
