| I've been struggling with FFI bindings to libraries which rely on the
| GNU MP Bignum Library (GMP).
It's an issue that bites very few users, but it bites them hard. It's also tricky, but not impossible, to fix. That combination means that at GHC HQ we keep working on things that affect more people instead. I doubt we can spare the effort to design and implement a fix in the near future -- we keep hoping someone else will step up and tackle it!

Peter Tanski did exactly that (he's the author of the ReplacingGMPNotes above), but he's been very quiet recently. I don't know how far he has got. Perhaps someone else would like to join in?

Thank you for the information - I'm also willing to help, though I'm not too familiar with the GHC internals (yet). I do like the idea of optionally linking with a pure-Haskell library, but I'm interested in a solution with comparable performance. Some comments on the solutions proposed for ticket #311:

(1) Creating a custom variant of the gmp lib by renaming symbols and possibly removing unnecessary functionality, as suggested by Simon Marlow in ticket #311, would be relatively straightforward; I've already tried this approach the other way round (i.e. recompiling the libraries that are to be used with the FFI). But it means that you'd have to maintain and ship another library, so I guess it is not an option for the GHC team.

(2) Using the standard allocation functions for the gmp memory management (maybe as a compile flag), as suggested in http://www.haskell.org/pipermail/glasgow-haskell-users/2006-July/010660.html, would also resolve ticket #311. In this case at least the dynamic part of gmp integers has to be resized using external allocation functions, and a finalizer (mpz_clear) has to be called when an Integer is garbage collected. It seems that the performance loss from using malloc is significant [1], as lots of allocations and reallocations of very small chunks occur in a functional setting; some kind of (non-garbage-collected!) memory pool allocation would certainly help. I'm not sure what overhead is associated with calling a finalizer.

(3) When replacing GMP with the BN library of OpenSSL (see http://hackage.haskell.org/trac/ghc/wiki/ReplacingGMPNotes/PerformanceMeasurements), it would probably be necessary to refactor the library so that custom allocation can be used as well. This does not seem too difficult at first glance, though.
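
To make option (2) a bit more concrete, here is a minimal sketch of the "malloc plus finalizer" pattern, written against the standard Foreign modules only. It does not touch gmp at all: the Limbs type and the newLimbs/readLimbs helpers are hypothetical names made up for illustration, and finalizerFree merely stands in for the mpz_clear finalizer mentioned above.

```haskell
module Main where

import Data.Word (Word64)
import Foreign.ForeignPtr (ForeignPtr, newForeignPtr, withForeignPtr)
import Foreign.Marshal.Alloc (finalizerFree, mallocBytes)
import Foreign.Ptr (Ptr)
import Foreign.Storable (peekElemOff, pokeElemOff, sizeOf)

-- Hypothetical stand-in for a gmp integer: a limb count plus limbs that
-- live outside the Haskell heap (allocated with malloc, not by the GC).
data Limbs = Limbs Int (ForeignPtr Word64)

-- Allocate the limbs with malloc and attach a finalizer. In a real binding
-- the finalizer would be mpz_clear; free() is used here as an assumption so
-- that the sketch needs no external library.
newLimbs :: [Word64] -> IO Limbs
newLimbs ws = do
  let n = length ws
  p <- mallocBytes (max 1 n * sizeOf (undefined :: Word64)) :: IO (Ptr Word64)
  mapM_ (\(i, w) -> pokeElemOff p i w) (zip [0 ..] ws)
  fp <- newForeignPtr finalizerFree p
  return (Limbs n fp)

-- Read the limbs back; withForeignPtr keeps the memory alive while we peek.
readLimbs :: Limbs -> IO [Word64]
readLimbs (Limbs n fp) =
  withForeignPtr fp $ \p -> mapM (peekElemOff p) [0 .. n - 1]

main :: IO ()
main = do
  x <- newLimbs [1, 2, 3]
  readLimbs x >>= print
```

Every value built this way costs one malloc, one finalizer registration and, eventually, one free, which is exactly the per-Integer overhead that option (2) would have to keep small.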

So I'd like to investigate the second or third option, as far as my knowledge and time permit. Of course it would be wise to check first whether Peter Tanski is already (or still) working on a GMP replacement.

Benedikt


[1]
Simple Performance Test on (ghc-darwin-i386-6.6.1):

The Haskell function (with k = 10M)
> test k = (iterateT k (fromIntegral (maxBound ::Int))) :: Integer where
>    iterateT 0 v = v; iterateT k v = v `seq` iterateT (k-1) (v+10000)
triggers around k allocations and k reallocations by the gmp library.
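
For anyone who wants to repeat the measurement, a self-contained version of the function above could look roughly like this. The type signatures and the small timing harness around getCPUTime are my additions for illustration, not part of the original test, and the numbers will of course depend on the GHC version and Integer backend in use.

```haskell
module Main where

import System.CPUTime (getCPUTime)

-- Same loop as above: start just at the edge of the Int range, so every
-- addition works on a "big" Integer rather than a machine word.
test :: Int -> Integer
test k = iterateT k (fromIntegral (maxBound :: Int))
  where
    iterateT :: Int -> Integer -> Integer
    iterateT 0 v = v
    iterateT n v = v `seq` iterateT (n - 1) (v + 10000)

main :: IO ()
main = do
  start <- getCPUTime                 -- CPU time in picoseconds
  let r = test 10000000               -- k = 10M, as in the text
  r `seq` return ()                   -- force the result before stopping the clock
  end <- getCPUTime
  putStrLn ("result: " ++ show r)
  putStrLn ("cpu seconds: " ++ show (fromIntegral (end - start) / 1e12 :: Double))
```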

The rough C equivalent, calling sequences of
> malloc(3), mpz_init_set(gmp), mpz_add_ui(gmp), mpz_clear(gmp) and free(3),
takes more than 2 times as long, with 25% of the time spent allocating and freeing pointers to gmp integers (mpz_ptr) and 50% of the time spent in the gmp allocator functions (i.e. resizing gmp integers = (re)allocating limbs).

I also performed the test with the datatype suggested by John Meacham (using a gmp library with renamed symbols),
> data FInteger = FInteger Int# !(ForeignPtr Mpz)
but it was around 8x slower, maybe due to the ForeignPtr and FFI overhead, or due to missing optimizations in the code.