ghc, tls, gmp
Hi, in Debian, we have the (well-known) problem of linking against libraries using libgmp, in this case haskell-curl, which links against libcurl, which links against gnutls, which uses libgmp since the latest release: https://lists.debian.org/debian-haskell/2014/07/msg0.html Are there any viable solutions to this problem that I might not be aware of? Are there any solutions to be expected in the near future? (My best idea so far is to use libcurl linked against openssl, but this causes licensing issues.) Greetings, Joachim -- Joachim nomeata Breitner Debian Developer nome...@debian.org | ICQ# 74513189 | GPG-Keyid: F0FBF51F JID: nome...@joachim-breitner.de | http://people.debian.org/~nomeata signature.asc Description: This is a digitally signed message part ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: renamed GMP symbols in GHC
On Thu, Jan 5, 2012 at 5:44 PM, Joachim Breitner nome...@debian.org wrote: Anyways, I’m just brainstorming a bit, although it seems I’m only turning up ideas that have been dismissed already... It is good to hash through these things. =) On a related note, which doesn't solve the general problem, I have been working with Dan Peebles on a nice set of MPFR bindings for my own purposes using custom foreign prims. I unpacked the main MPFR structures and let the ghc garbage collector move everything around, like it does with GMP itself. We've been able to get a version that ALMOST works as expected. That is to say it works perfectly unless you need to access a function that pulls from its built-in constant cache. (MPFR internally caches the result of computing the first n digits of pi or log of 2, etc. growing the cache as you demand longer numbers.) Dan was able to swap out the ghc gmp allocation hook for a slightly slower one that checks to see if it is being called from the MPFR cache management function, and diverting the allocation back to malloc. Ideally we would swap the handler in an initializer when the library loads, and this works perfectly in ghc, but not ghci -- which apparently links in libraries in a way where c++ style initializers don't get invoked? A 'replaceAllocator' IO action that swaps the gmp allocation hook isn't a very Haskelly solution either, because I'd prefer to just have it look like another numeric type. Dan is currently investigating including a patched copy of MPFR in the haskell package, which doesn't try to use the built-in allocator for the constant cache. This is where it becomes relevant to the discussion at hand, because it would effectively involve linking in our own copy of MPFR, making distributions unhappy. But another option would be to unsafePerformIO that initializer, which would add a bit of overhead, going through an indirection to make sure that replaceAllocator had been forced, perhaps it wouldn't be too bad: Something like: replaceAllocator :: () replaceAllocator = unsafePerformIO replaceAllocatorIO {-# NOINLINE replaceAllocator #-} instance (Rounding r, Precision p) = Floating (Fixed r p) where pi = replaceAllocator `pseq` mpfr_pi ... This doesn't address general purpose use of third party libraries that happen to internally rely upon GMP, however. Or rather, if it does, the methodology it would lead to would be one of bringing over all of their internals into Haskell. ;) -Edward Kmett ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: renamed GMP symbols in GHC
On 04/01/2012 21:00, Joachim Breitner wrote: Hi, Am Mittwoch, den 04.01.2012, 20:50 +0100 schrieb Axel Simon: On 04.01.2012, at 17:50, Joachim Breitner wrote: BTW, Is there a way to get the linker to create two independent copies of a library in one program space? Maybe if it is compiled as PIC (random name dropping here)? That would seem to be an elegant solution, as it makes the distro packers happy and you would not have to maintain a code copy. In the past, I've linked a C++ library that used gmpxx against a Haskell program by renaming all symbols starting with gmp to mygmp using objcopy. Unfortunately, this is not portable and completely broke down on Mac OS when Apple moved to fat binaries (Intel and PowerPC) since their objcopy version was crippled (there doesn't even seem to be an objcopy anymore on later OS X versions). Thus, renaming symbols after compilation is non-portable and sometimes not possible without writing your own tool. So, I propose to revert to renaming the symbols in the source code which could probably be done automatically using a lot of CPP #defines, starting from some sort of source code tar ball of gmp. This would also allow to always use the latest gmp sources without too much hassle. Just to more random ideas that can probably easily dismissed by more knowledgeable people: Would linking gmp statically help? E.g. is there a way to link libgmp into the RTS that the symbols are not visible to the linker any more? Linking a static copy of GMP into the integer-gmp package is an interesting idea that I hadn't considered before. It ought to be possible, but I haven't played around with it. You would probably have to resolve all the symbols between the integer-gmp code and GMP itself when building the library, and don't expose any GMP symbols in the resulting .a file. That might mean making a big .o file containing GMP and the integer-gmp code that refers to it. I expect there would be problems with shared libraries though. You can't link the static GMP into the shared integer-gmp, because the static GMP isn't built with -fPIC. And would dlopen make a difference? RTLD_LOCAL sounds interesting... Maybe, I haven't looked into that. Cheers, Simon Greetings, Joachim ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: renamed GMP symbols in GHC
On Thu, Jan 5, 2012 at 09:37, Simon Marlow marlo...@gmail.com wrote: On 04/01/2012 21:00, Joachim Breitner wrote: And would dlopen make a difference? RTLD_LOCAL sounds interesting... Maybe, I haven't looked into that. Beware of platform issues; IIRC RTLD_LOCAL doesn't do what one expects on Alphas. -- brandon s allbery allber...@gmail.com wandering unix systems administrator (available) (412) 475-9364 vm/sms ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: renamed GMP symbols in GHC
On 04/01/2012 21:00, Joachim Breitner wrote: Would linking gmp statically help? E.g. is there a way to link libgmp into the RTS that the symbols are not visible to the linker any more? It has slightly more licensing complications - GMP is LGPL, which requires that the user of a program that includes it be able to replace GMP. Shared libraries do that easily. Static linking would probably require someone who compiles+distributes a program that uses both Haskell and GMP/MPFR to distribute their .o files (or alternatively their source code), so that they can be re-linked to a different GMP. (Their program still does *not* have to be LGPLed.) The parties I've heard publicly concerned about duplicate packages are Linux/BSD distros that primarily support Free Software. So that complication might not even be a problem (provided the GHC user would get a choice whether to staticly link, and probably default to dynamic linking). ~Isaac ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: renamed GMP symbols in GHC
Hi, Am Mittwoch, den 04.01.2012, 20:50 +0100 schrieb Axel Simon: On 04.01.2012, at 17:50, Joachim Breitner wrote: BTW, Is there a way to get the linker to create two independent copies of a library in one program space? Maybe if it is compiled as PIC (random name dropping here)? That would seem to be an elegant solution, as it makes the distro packers happy and you would not have to maintain a code copy. In the past, I've linked a C++ library that used gmpxx against a Haskell program by renaming all symbols starting with gmp to mygmp using objcopy. Unfortunately, this is not portable and completely broke down on Mac OS when Apple moved to fat binaries (Intel and PowerPC) since their objcopy version was crippled (there doesn't even seem to be an objcopy anymore on later OS X versions). Thus, renaming symbols after compilation is non-portable and sometimes not possible without writing your own tool. let me pick up this idea again. It was pointed out that it is mostly Linux and BSD based distributions that dislike code copies. For these, the objcopy-way might work. It does not work on MacOS, but that target does not have the code copy requirements.. So would it be possible to * use objcopy at GHC build-time to take the system libgmp shared library, rename the symbols, and install the modified library with ghc on architectures that support it, * have a code copy for those who don’t? For the distros this has the nice effect that if there is a bugfix in libgmp, they just have to rebuild ghc without any changes to the sources to benefit from it. (But I am really wondering why the linker cannot do something that has the same effect as objcopy --prefix-symbols, but on the fly.) Greetings, Joachim -- Joachim nomeata Breitner Debian Developer nome...@debian.org | ICQ# 74513189 | GPG-Keyid: 4743206C JID: nome...@joachim-breitner.de | http://people.debian.org/~nomeata signature.asc Description: This is a digitally signed message part ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: renamed GMP symbols in GHC
Hi, Am Mittwoch, den 04.01.2012, 22:00 +0100 schrieb Joachim Breitner: And would dlopen make a difference? RTLD_LOCAL sounds interesting... it seems that some OSs provide a RTLD_PRIVATE which does exactly what we need: http://uw714doc.sco.com/en/man/html.3C/dlopen.3C.html But unfortunately, glibc does not seem to support it. But still, RTLD_LOCAL might be enough: RTLD_LOCAL This is the converse of RTLD_GLOBAL, and the default if nei‐ ther flag is specified. Symbols defined in this library are not made available to resolve references in subsequently loaded libraries. Greetings, Joachim -- Joachim nomeata Breitner Debian Developer nome...@debian.org | ICQ# 74513189 | GPG-Keyid: 4743206C JID: nome...@joachim-breitner.de | http://people.debian.org/~nomeata signature.asc Description: This is a digitally signed message part ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: renamed GMP symbols in GHC
On Thu, Jan 5, 2012 at 13:53, Joachim Breitner nome...@debian.org wrote: (But I am really wondering why the linker cannot do something that has the same effect as objcopy --prefix-symbols, but on the fly.) Some of them can; notably the binutils ld, which comes from the same source as and uses the same mechanism as objcopy. The linker on OS X doesn't have this particular mechanism (although I think it has other features that could be used toward the same end, if not in the same way), and binutils ld does not work properly on OS X (nor does objcopy, again because it's the same mechanisms; the BFD library doesn't support Mach-O properly. This is probably why Apple at first shipped a crippled version and later removed it entirely. You can install a full set from MacPorts, but it breaks a *lot* of stuff...). And Solaris's ld has still another way to do this kind of thing, IIRC. -- brandon s allbery allber...@gmail.com wandering unix systems administrator (available) (412) 475-9364 vm/sms ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: renamed GMP symbols in GHC
Hi, Am Donnerstag, den 05.01.2012, 14:14 -0500 schrieb Brandon Allbery: On Thu, Jan 5, 2012 at 13:53, Joachim Breitner nome...@debian.org wrote: (But I am really wondering why the linker cannot do something that has the same effect as objcopy --prefix-symbols, but on the fly.) Some of them can; notably the binutils ld, which comes from the same source as and uses the same mechanism as objcopy. they do? I couldn’t any flag in that direction. I just tried a few combinations with a simple testprogram (git repo attached, every commit is one state). Unfortunately, using dlopen does not not help if some other shared library loads gmp regularly; there is still only one global state. And statically linking gmp into the share object does not work, as Simon predicted, because the static gmp library is not compiled with PIC. The promising ld option -Bgroup does not seem to have the desired effect. If only dlopen had RTDL_PRIVATE support... Greetings, Joachim -- Joachim nomeata Breitner Debian Developer nome...@debian.org | ICQ# 74513189 | GPG-Keyid: 4743206C JID: nome...@joachim-breitner.de | http://people.debian.org/~nomeata gmp-link-test.tar.gz Description: application/compressed-tar signature.asc Description: This is a digitally signed message part ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: renamed GMP symbols in GHC
On Thu, Jan 5, 2012 at 16:15, Joachim Breitner nome...@debian.org wrote: Am Donnerstag, den 05.01.2012, 14:14 -0500 schrieb Brandon Allbery: On Thu, Jan 5, 2012 at 13:53, Joachim Breitner nome...@debian.org wrote: (But I am really wondering why the linker cannot do something that has the same effect as objcopy --prefix-symbols, but on the fly.) Some of them can; notably the binutils ld, which comes from the same source as and uses the same mechanism as objcopy. they do? I couldn’t any flag in that direction. Hrm. I thought it did. Possibly it requires an ld-script, or I'm confusing it with Solaris ld. In any case, I am starting to approach the point of so will Debian allow ghc to remain compatible with non-Linux?, since so far I'm getting the distinct impression that solutions that work on Linux are all that matter. -- brandon s allbery allber...@gmail.com wandering unix systems administrator (available) (412) 475-9364 vm/sms ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: renamed GMP symbols in GHC
* Simon Marlow: One potential problem is that some Linux distributions really don't like it if you bundle modified versions of external libraries. However, I just don't see a way around this: GMP is inherently broken because it has global state, so if you want two use it from two clients in the same program, you need two copies of it. Is this about the allocation functions? You could use the mpn functions (instead of the mpz variants), where you need to supply your own memory regions. It's not entirely straightforward because you need to calculate the expected lengths, manage the sign bit, and ensure that a few preconditions hold. The only thing which appears to be really missing is modular exponentiation (mpz_powm and mpz_powm_ui), but GHC doesn't seem to export them (huh?). Sure, it's quite a bit of work, but I expect it's more portable than hairy linker tricks. ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: renamed GMP symbols in GHC
Hi, Am Donnerstag, den 05.01.2012, 16:21 -0500 schrieb Brandon Allbery: In any case, I am starting to approach the point of so will Debian allow ghc to remain compatible with non-Linux?, since so far I'm getting the distinct impression that solutions that work on Linux are all that matter. no, not at all. But I was under the impression that code copies will be fine for systems besides Linux and BSDs, and we are looking for a different solution especially for those. Anyways, I’m just brainstorming a bit, although it seems I’m only turning up ideas that have been dismissed already... Greetings, Joachim -- Joachim nomeata Breitner Debian Developer nome...@debian.org | ICQ# 74513189 | GPG-Keyid: 4743206C JID: nome...@joachim-breitner.de | http://people.debian.org/~nomeata signature.asc Description: This is a digitally signed message part ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: renamed GMP symbols in GHC
On 22/12/2011 22:58, Michal Konečný wrote: Several issues related to the way GMP is included in GHC were publicly discussed in the past with the goal of replacing GMP. As summarised in this wiki by Peter Tanski http://hackage.haskell.org/trac/ghc/wiki/ReplacingGMPNotes, the main issues were: (1) Licensing (2) Memory Structure; Simultaneous Access to GMP by Foreign (C) code in the Same Binary. My immediate concern is (2) as I develop programs that link with MPFR, which uses GMP. So far I got around the issue by compiling GHC with integer-simple. This is fine for me but I feel the need to recompile ghc may be hindering some people who would like to try my software that links with MPFR. I have therefore been looking for a way to overcome (2) that could make it to the default build of GHC. As suggested by Simon PJ in this email http://www.haskell.org/pipermail/glasgow-haskell-users/2006-August/010676.html, a simple way to deal with issue (2) is to *copy* GMP, changing all the function names. I recently found some time and had a go at implementing this suggestion and it turned out to be quite easy. I described my recipe to do such a copy here http://code.google.com/p/hmpfr/wiki/GHCWithRenamedGMP but currently I have been able to test it only on x86 Linux and GHC 7.2.1. I checked that the renaming makes a difference by running all QuickCheck tests of my AERN-Real-MPFR package. Some of these tests persistently fail when compiled with a standard GHC. All tests succeed with GHC/integer-simple and also with GHC 7.2.1 using the renamed GMP. I was wondering whether such a change (more polished and tested on all common platforms) could make it to the future official GHC releases. If the GHC developers would support this change, I would be happy to put more work into it. I think the following concrete changes would be required in the GHC distribution: (a) make a ghc build always use the bundled GMP (b) apply a renaming script onto the GMP tar before adding it to the GHC source bundle (c) rename symbols analogously in integer-gmp/cbits/* An alternative to (b) is to apply the renaming to the GMP sources just before building it. One thing I am not clear about is the impact such a change would have on lincensing. My understanding is that GHC sources include GMP sources and in the absence of an installed GMP library, they are statically linked into the integer-gmp package. (Does it mean that the integer-gmp package should be LGPL lincensed?) The suggested changes would mean that this kind of linking would happen always, not only when no other GMP library is available on the system. In either case, it seems to me that using integer-gmp as a shared library would still be OK for producing non-LGPL code. I would be grateful for your views and guidance. Ok, as I understand it this would be fine from a licensing perspective: we would be modifying the source, but distributing the modifications (either as a patch or a script, it doesn't matter). One potential problem is that some Linux distributions really don't like it if you bundle modified versions of external libraries. However, I just don't see a way around this: GMP is inherently broken because it has global state, so if you want two use it from two clients in the same program, you need two copies of it. If these Linux distributions still kick up a fuss, then we would have to back off and not bundle the modified GMP, but then users of GHC on those distros would have to install their own local copy of GHC in order to use MPFR or some other GMP client. Can anyone involved with packaging for Debian or Fedora comment? Cheers, Simon ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: renamed GMP symbols in GHC
Dear Michal, Am Mittwoch, den 04.01.2012, 16:33 + schrieb Michal Konečný: On Wednesday 04 January 2012 12:31:23 Joachim Breitner wrote: I guess this means me... Indeed Debian has the policy to avoid modified bundled libraries, if somehow possible. For example, we patch the build system to use the system-provided libffi. I am curious about the precise definition of bundled libraries. It can be arranged that the GMP source is modified at GHC build time, so the _source_ package contains the original unmodified tar of GMP (except without documentation). Nevertheless, the _binary_ GHC package will contain integer-gmp library files that contain a binary copy of GMP whose symbols have been renamed. Does this count as a modified bundled library? (I am guessing yes.) If such binary bundling is not permissible, would it ok to have a separate Debian package called eg libghcgmp3c2 which is equal to libgmp3c2 except the exported symbols are renamed as expected by a new integer-gmp and the files are suitably renamed to avoid any conflict with libgmp3c2? both would be no better than having a modified copy in the ghc tarball. This is not a formal requirement but rather a guideline with a rationale that code should be shared, not copied. The most prominent reason is security fixes: If code is copied and a security hole is found, the security team needs to hunt down all copies. With a single shared library, this is not a problem (zlib has been repeatedly a “good” example of this problem). Now you might argue that gmp will never be the source of security problems (although I woudn’t be too convinced about that). But even then regular bug fixes and arch-specific fixes (which were required once for s390) in the main gmp library would not reach GHC automatically. The guideline is in place in Debian also because we think it is the right thing to do, even if sometime more work, for a better and healthier ecosystem. So in conclusion: If you just cannot use the regular GMP library, then just copy it and live with the bad effects. You do not have to put effort in to make it look “nicer” (such as putting it in a separate library package). But preferably, try hard to avoid this issue, also for your own benefit. BTW, Is there a way to get the linker to create two independent copies of a library in one program space? Maybe if it is compiled as PIC (random name dropping here)? That would seem to be an elegant solution, as it makes the distro packers happy and you would not have to maintain a code copy. On Wednesday 04 January 2012 12:21:13 Simon Marlow wrote: GMP is inherently broken because it has global state, so if you want two use it from two clients in the same program, you need two copies of it. If this could be fixed that would be fantastic. Nevertheless, I am currently unaware of how hard this might be to persue, technically or politically. (My gut feeling is that it is not straightforward.) Someone (I know, not a helpful way to start a sentence :-)) should ask upstream before we make guesses. Greetings, Joachim -- Joachim nomeata Breitner Debian Developer nome...@debian.org | ICQ# 74513189 | GPG-Keyid: 4743206C JID: nome...@joachim-breitner.de | http://people.debian.org/~nomeata signature.asc Description: This is a digitally signed message part ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: renamed GMP symbols in GHC
On Wed, Jan 4, 2012 at 11:50, Joachim Breitner nome...@debian.org wrote: Now you might argue that gmp will never be the source of security problems (although I woudn’t be too convinced about that). But even then There's actually a patch for a (claimed to be minor potential) security issue referenced on the releases page at the moment. -- brandon s allbery allber...@gmail.com wandering unix systems administrator (available) (412) 475-9364 vm/sms ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: renamed GMP symbols in GHC
Hi all, On 04.01.2012, at 17:50, Joachim Breitner wrote: BTW, Is there a way to get the linker to create two independent copies of a library in one program space? Maybe if it is compiled as PIC (random name dropping here)? That would seem to be an elegant solution, as it makes the distro packers happy and you would not have to maintain a code copy. In the past, I've linked a C++ library that used gmpxx against a Haskell program by renaming all symbols starting with gmp to mygmp using objcopy. Unfortunately, this is not portable and completely broke down on Mac OS when Apple moved to fat binaries (Intel and PowerPC) since their objcopy version was crippled (there doesn't even seem to be an objcopy anymore on later OS X versions). Thus, renaming symbols after compilation is non-portable and sometimes not possible without writing your own tool. So, I propose to revert to renaming the symbols in the source code which could probably be done automatically using a lot of CPP #defines, starting from some sort of source code tar ball of gmp. This would also allow to always use the latest gmp sources without too much hassle. My 2p, Axel ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: renamed GMP symbols in GHC
On Wed, Jan 04, 2012 at 12:31:23PM +, Joachim Breitner wrote: One potential problem is that some Linux distributions really don't like it if you bundle modified versions of external libraries. However, I just don't see a way around this: [...] [...] I guess this means me... Indeed Debian has the policy to avoid modified bundled libraries, if somehow possible. For example, we patch the build system to use the system-provided libffi. This policy isn't even specific to linux distributions ;-) I don't know about the package building infrastructure for debian or fedora, but for openbsd (where i'm doing a lot of haskell stuff), it would be enough if the ghc sources would include not only a (patched or unpatched) gmp source tree but also the ghc-specific patches to gmp. The rationale behind this polcicy (for openbsd, i can't speak for debian): if there are 42 packages where the source distribution files contain their own (probably patched) version of gmp, and suddenly a critical patch has to be applied to gmp, we would have to apply it 43 times (for gmp itself and for all the 42 packages using a bundled gmp). If the source distribution files contained diffs for gmp, we could (at least try to) extract our patched gmp and apply the diff on top of it. = less work, any openbsd-specific patch automatically will be applied to all 42 packages. Ciao, Kili ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
renamed GMP symbols in GHC
Dear all, Several issues related to the way GMP is included in GHC were publicly discussed in the past with the goal of replacing GMP. As summarised in this wiki by Peter Tanski, the main issues were: (1) Licensing (2) Memory Structure; Simultaneous Access to GMP by Foreign (C) code in the Same Binary. My immediate concern is (2) as I develop programs that link with MPFR, which uses GMP. So far I got around the issue by compiling GHC with integer-simple. This is fine for me but I feel the need to recompile ghc may be hindering some people who would like to try my software that links with MPFR. I have therefore been looking for a way to overcome (2) that could make it to the default build of GHC. As suggested by Simon PJ in this email, a simple way to deal with issue (2) is to *copy* GMP, changing all the function names. I recently found some time and had a go at implementing this suggestion and it turned out to be quite easy. I described my recipe to do such a copy here but currently I have been able to test it only on x86 Linux and GHC 7.2.1. I checked that the renaming makes a difference by running all QuickCheck tests of my AERN-Real-MPFR package. Some of these tests persistently fail when compiled with a standard GHC. All tests succeed with GHC/integer-simple and also with GHC 7.2.1 using the renamed GMP. I was wondering whether such a change (more polished and tested on all common platforms) could make it to the future official GHC releases. If the GHC developers would support this change, I would be happy to put more work into it. I think the following concrete changes would be required in the GHC distribution: (a) make a ghc build always use the bundled GMP (b) apply a renaming script onto the GMP tar before adding it to the GHC source bundle (c) rename symbols analogously in integer-gmp/cbits/* An alternative to (b) is to apply the renaming to the GMP sources just before building it. One thing I am not clear about is the impact such a change would have on lincensing. My understanding is that GHC sources include GMP sources and in the absence of an installed GMP library, they are statically linked into the integer-gmp package. (Does it mean that the integer-gmp package should be LGPL lincensed?) The suggested changes would mean that this kind of linking would happen always, not only when no other GMP library is available on the system. In either case, it seems to me that using integer-gmp as a shared library would still be OK for producing non-LGPL code. I would be grateful for your views and guidance. Best regards, Michal -- |o| Michal Konecny mikkone...@gmail.com |o|http://www-users.aston.ac.uk/~konecnym/ |o|office: (+42) (0)121 204 3462 |o| PGP key http://www-users.aston.ac.uk/~konecnym/ki.aston signature.asc Description: This is a digitally signed message part. ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: Confusion about GHC and GMP
I assume you mean http://hackage.haskell.org/trac/ghc/wiki/ReplacingGMPNotes#CurrentStatus in which case, yes. I assume that INTEGER_LIBRARY=integer-foo is an option when compiling GHC itself and not when using GHC, correct? Yes. Great, thanks for your help Ian. Building GHC with INTEGER_LIBRARY=integer-simple is exactly what I was looking for. So naturally though, I'd like to avoid maintaining an internal branch of GHC. Are there any plans to make integer-simple the default integer library? I was able to build GHC on Linux with no trouble. But on Windows, I go down because (I think) the preprocessor is choking on ^M characters. I get errors like this: libraries\haskeline\.\System\Console\Haskeline\Backend\Win32.hsc:153: error: syntax error before ')' token libraries\haskeline\.\System\Console\Haskeline\Backend\Win32.hsc: In function `main': libraries\haskeline\.\System\Console\Haskeline\Backend\Win32.hsc:155: error: syntax error before ';' token compiling libraries/haskeline/dist-install/build/System/Console/Haskeline/Backend/Win32_hsc_make.c After removing those ^M's from Win32.hsc and Win32_hsc_make.c, things are moving along, slowly but surely. The build is using the inplace gcc so I assume the problem is not gcc, but whatever generates those files. Any idea how the extra line feeds slip in? Thanks, Greg ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: Confusion about GHC and GMP
On Fri, Feb 12, 2010 at 01:17:13PM -0800, Greg Fitzgerald wrote: I wonder if someone might be able to clear a few things up for me about implementing Integer with GMP. First, is the link below the most up-to-date information regarding GHC's situation? http://hackage.haskell.org/trac/ghc/wiki/ReplacingGMPNotes#BinaryDropinReplacementforGMP I assume you mean http://hackage.haskell.org/trac/ghc/wiki/ReplacingGMPNotes#CurrentStatus in which case, yes. I assume that INTEGER_LIBRARY=integer-foo is an option when compiling GHC itself and not when using GHC, correct? Yes. If so, what version of GHC was this added? 6.10, 6.12, HEAD? 6.12. Lastly, I'm confused about GHC's posted license, should it actually be LGPL so long as it statically links GMP? IANAL, but on most platforms GMP is dynamically linked, and see also http://haskell.forkio.com/gmpwindows Thanks Ian ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Confusion about GHC and GMP
I wonder if someone might be able to clear a few things up for me about implementing Integer with GMP. First, is the link below the most up-to-date information regarding GHC's situation? http://hackage.haskell.org/trac/ghc/wiki/ReplacingGMPNotes#BinaryDropinReplacementforGMP I assume that INTEGER_LIBRARY=integer-foo is an option when compiling GHC itself and not when using GHC, correct? If so, what version of GHC was this added? 6.10, 6.12, HEAD? Lastly, I'm confused about GHC's posted license, should it actually be LGPL so long as it statically links GMP? Aren't there a number of companies releasing binaries that were compiled with GHC? Thanks, Greg ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: Build of HEAD fails in integer-gmp
On 25/11/2009 21:20, Bryan O'Sullivan wrote: I get this error message, which suggests that the gmp-wrappers file should be compiled with -fPIC but isn't being: /usr/bin/ld: libraries/integer-gmp/dist-install/build/cbits/gmp-wrappers.dyn_o: relocation R_X86_64_PC32 against undefined symbol `__gmpz_init' can not be used when making a shared object; recompile with -fPIC This was temporary breakage, fixed a couple of days ago: http://www.haskell.org/pipermail/cvs-ghc/2009-November/051393.html I'm not sure why google isn't indexing the cvs-ghc archives, otherwise that would have been an easy one to find. Cheers, Simon ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Build of HEAD fails in integer-gmp
I get this error message, which suggests that the gmp-wrappers file should be compiled with -fPIC but isn't being: /usr/bin/ld: libraries/integer-gmp/dist-install/build/cbits/gmp-wrappers.dyn_o: relocation R_X86_64_PC32 against undefined symbol `__gmpz_init' can not be used when making a shared object; recompile with -fPIC ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: gmp
chak: Yitzchak Gale: OK for the time being, but it would be really, really good to be able to compile ghc without gmp. Well, just go ahead and write an alternative portable high- performance implementation of Integer. This idea of a Mac OS X binary with statically-linked gmp is nice, it is really convenient. But someone needs to completely clarify the license issues in that case, and make it completely clear to all users. I completely agree that we need to document the situation clearly. Otherwise, I don't really see what the fuss is about. Most GHC users don't care whether GMP is linked into their code (as its either only used internally or has a GPL-compatible license anyway). If a company wants to distribute GHC compiled binaries of non-GPL compatible code, well, they have to compile their own version of GHC on the Mac that links GMP dynamically, and then, use that version of GHC to link their final product. That is going to be a trivial task compared to developing that product in the first place. So, who cares? I agree: there's a lot of effort here without an obvious demand. Do we know of anyone not using GHC commercially because the can't use GMP? -- Don ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: building against gmp in a nonstandard location
Ian Lynagh [EMAIL PROTECTED] wrote: the alternative would be to put the whole of $LDFLAGS into the Cabal buildinfo, but that feels wrong to me. Why so? By setting LDFLAGS, isn't the user asking for exactly that - that those flags be used when linking? paul ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: building against gmp in a nonstandard location
On Sun, Mar 30, 2008 at 03:37:35PM -0400, Paul Jarc wrote: Ian Lynagh [EMAIL PROTECTED] wrote: the alternative would be to put the whole of $LDFLAGS into the Cabal buildinfo, but that feels wrong to me. Why so? By setting LDFLAGS, isn't the user asking for exactly that - that those flags be used when linking? Perhaps, but readline is only 1 component of what is being built. I guess it would be unlikely to do any harm to use linker flags intended for another component, though. If we want to go this route then I think Cabal should pick up $LDFLAGS automatically when building packages, though. Thanks Ian ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: building against gmp in a nonstandard location
On Fri, Mar 21, 2008 at 01:44:37AM -0400, Paul Jarc wrote: There are no --with-* flags for ncurses, so the only ways I can see to say where ncurses is are the ones I'm already using, and which are apparently insufficient - $CPPFLAGS/$LDFLAGS in the environment, and SRC_HC_OPTS/SRC_HSC2HC_OPTS/SRC_CC_OPTS/GHC_CC_OPTS in mk/build.mk. I think the right thing to do is to add --with-* flags for (n)curses/termcap; the alternative would be to put the whole of $LDFLAGS into the Cabal buildinfo, but that feels wrong to me. We should probably also be undefining variables like LDFLAGS in configure, as the autoconf macros will be using them to check if libraries exist, but we don't use them when compiling. Thanks Ian ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: building against gmp in a nonstandard location
Hi Paul, On Sun, Mar 16, 2008 at 12:05:03AM -0400, Paul Jarc wrote: make[1]: *** No rule to make target `stage1//package/host/code.dogmap.org/foreign/ghc-6.8.2+spf+0/conf/gmp/include/gmp.h', needed by `stage1/parser/cutils.o'. Stop. Where is gmp.h actually installed? Do you know where this dependency is coming from? Is it compiler/.depend-1? If so, try looking at the mkdependC commandline and see if that provides any clues. Second attempt: CPPFLAGS, LDFLAGS and mk/build.mk as above, and I also gave --with-gmp-{includes,libraries} to ./configure. This ends with the same error. What happens if you only use the flags? Third attempt: all of the above, plus I modified mkdependC.prl to omit dependencies on absolute paths. (I suspect this is wrong, since I assume other people have successfully used gmp in a nonstandard path without having to patch anything, but anyway...) New failure: ../../compiler/stage1/ghc-inplace -Iinclude -package rts-1.0 -optc-O2 -odir dist/build -c cbits/longlong.c -o dist/build/cbits/longlong.o In file included from /fs/pkgs/mount/package/host/code.dogmap.org/foreign/ghc-6.8.2+spf+0/compile/src/ghc-6.8.2/includes/Stg.h:150, from /fs/pkgs/mount/package/host/code.dogmap.org/foreign/ghc-6.8.2+spf+0/compile/src/ghc-6.8.2/includes/Rts.h:19, from cbits/longlong.c:29:0: /fs/pkgs/mount/package/host/code.dogmap.org/foreign/ghc-6.8.2+spf+0/compile/src/ghc-6.8.2/includes/Regs.h:28:17: error: gmp.h: No such file or directory I would have thought the SRC_HC_OPTS options I put in mk/build.mk would be used for this compilation, but apparently not. Does anyone see where the problem might be? This is compiling a C file, so it uses SRC_CC_OPTS instead. Thanks Ian ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: building against gmp in a nonstandard location
Ian Lynagh [EMAIL PROTECTED] wrote: On Sun, Mar 16, 2008 at 12:05:03AM -0400, Paul Jarc wrote: make[1]: *** No rule to make target `stage1//package/host/code.dogmap.org/foreign/ghc-6.8.2+spf+0/conf/gmp/include/gmp.h', needed by `stage1/parser/cutils.o'. Stop. Where is gmp.h actually installed? /package/host/code.dogmap.org/foreign/ghc-6.8.2+spf+0/conf/gmp/include/gmp.h Do you know where this dependency is coming from? Is it compiler/.depend-1? Yes, and .depend-BASE before that. The only difference in .depend-1 is that stage1/ is prepended. compiler/parser/cutils.c includes includes/Rts.h, which includes gmp.h. So mkdependC puts that information in .depend-BASE. I'm sure it doesn't belong there, since gmp.h isn't part of the GHC source tree, and I guess if it works for other people, then it must not be there for them (can anyone confirm that?), but I don't see how it gets filtered out for them. Second attempt: CPPFLAGS, LDFLAGS and mk/build.mk as above, and I also gave --with-gmp-{includes,libraries} to ./configure. This ends with the same error. What happens if you only use the flags? Do you mean CPPFLAGS and LDFLAGS, without putting anything in mk/build.mk or specifying --with-gmp-*? That fails here: Creating ghcplatform.h... Done. ../utils/mkdependC/mkdependC -f .depend -- -O -DTABLES_NEXT_TO_CODE -I. -I../rts-- mkDerivedConstants.c gcc -O -DTABLES_NEXT_TO_CODE -I. -I../rts-c mkDerivedConstants.c -o mkDerivedConstants.o In file included from Stg.h:150, from Rts.h:19, from mkDerivedConstants.c:23: Regs.h:28:17: error: gmp.h: No such file or directory In file included from Stg.h:150, from Rts.h:19, from mkDerivedConstants.c:23: Regs.h:121: error: expected specifier-qualifier-list before 'MP_INT' In file included from mkDerivedConstants.c:23: Rts.h:203: error: expected ')' before '*' token Rts.h:204: error: expected ')' before '*' token mkDerivedConstants.c: In function 'main': mkDerivedConstants.c:218: error: 'StgRegTable' has no member named 'rRet' mkDerivedConstants.c:218: error: 'StgRegTable' has no member named 'rRet' mkDerivedConstants.c:222: error: 'StgRegTable' has no member named 'rmp_tmp1' mkDerivedConstants.c:223: error: 'StgRegTable' has no member named 'rmp_tmp2' mkDerivedConstants.c:224: error: 'StgRegTable' has no member named 'rmp_result1' mkDerivedConstants.c:225: error: 'StgRegTable' has no member named 'rmp_result2' mkDerivedConstants.c:417: error: 'MP_INT' undeclared (first use in this function) mkDerivedConstants.c:417: error: (Each undeclared identifier is reported only once mkDerivedConstants.c:417: error: for each function it appears in.) mkDerivedConstants.c:418: error: expected expression before ')' token mkDerivedConstants.c:418: error: expected expression before ')' token mkDerivedConstants.c:419: error: expected expression before ')' token mkDerivedConstants.c:419: error: expected expression before ')' token mkDerivedConstants.c:420: error: expected expression before ')' token mkDerivedConstants.c:420: error: expected expression before ')' token mkDerivedConstants.c:422: error: 'mp_limb_t' undeclared (first use in this function) make[1]: *** [mkDerivedConstants.o] Error 1 make: *** [stage1] Error 1 This is compiling a C file, so it uses SRC_CC_OPTS instead. Actually, it would be GHC_CC_OPTS, since it's ghc compiling a C file, right? I tried adding the -optc-I flags, etc., to GHC_CC_OPTS in mk/build.mk, and it gets farther now, but still fails: Configuring hpc-0.5.0.0... rm -f hpc/GNUmakefile cp Makefile.local hpc if ifBuildable/ifBuildable hpc; then \ cd hpc setup/Setup makefile -f GNUmakefile; \ fi In file included from /fs/pkgs/mount/package/host/code.dogmap.org/foreign/ghc-6.8.2+spf+3/compile/src/ghc-6.8.2/includes/Stg.h:150, from /fs/pkgs/mount/package/host/code.dogmap.org/foreign/ghc-6.8.2+spf+3/compile/src/ghc-6.8.2/includes/Rts.h:19, from Reflect.hsc:18: /fs/pkgs/mount/package/host/code.dogmap.org/foreign/ghc-6.8.2+spf+3/compile/src/ghc-6.8.2/includes/Regs.h:28:17: error: gmp.h: No such file or directory This seems to be coming from an invocation of hsc2hs for dependency generation. I don't know how to add extra options to that invocation, since it isn't directly in any of the Makefiles. But if it works for everyone else, I shouldn't need to, either... paul ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: building against gmp in a nonstandard location
On Thu, Mar 20, 2008 at 01:29:26PM -0400, Paul Jarc wrote: Ian Lynagh [EMAIL PROTECTED] wrote: On Sun, Mar 16, 2008 at 12:05:03AM -0400, Paul Jarc wrote: Second attempt: CPPFLAGS, LDFLAGS and mk/build.mk as above, and I also gave --with-gmp-{includes,libraries} to ./configure. This ends with the same error. What happens if you only use the flags? Do you mean CPPFLAGS and LDFLAGS, without putting anything in mk/build.mk or specifying --with-gmp-*? That fails here: Sorry, I meant --with-gmp-{includes,libraries}. if ifBuildable/ifBuildable hpc; then \ cd hpc setup/Setup makefile -f GNUmakefile; \ fi In file included from /fs/pkgs/mount/package/host/code.dogmap.org/foreign/ghc-6.8.2+spf+3/compile/src/ghc-6.8.2/includes/Stg.h:150, from /fs/pkgs/mount/package/host/code.dogmap.org/foreign/ghc-6.8.2+spf+3/compile/src/ghc-6.8.2/includes/Rts.h:19, from Reflect.hsc:18: /fs/pkgs/mount/package/host/code.dogmap.org/foreign/ghc-6.8.2+spf+3/compile/src/ghc-6.8.2/includes/Regs.h:28:17: error: gmp.h: No such file or directory This seems to be coming from an invocation of hsc2hs I think this is something we fixed in the 6.8 darcs branch since releasing 6.8.2. Try building a recent 6.8 branch snapshot instead: http://www.haskell.org/ghc/dist/stable/dist/ Thanks Ian ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: building against gmp in a nonstandard location
Ian Lynagh [EMAIL PROTECTED] wrote: Sorry, I meant --with-gmp-{includes,libraries}. Ah, ok. --with-gmp-* alone, without setting environment variables, creating mk/build.mk, or anything else, gives the same error as --with-gmp-* plus environment variables (No rule to make target `stage1//.../gmp.h', needed by `stage1/parser/cutils.o'.) Trying 6.8.2.20080317, using --with-ghc-*, setting environment variables, creating mk/build.mk, and hacking mkdependC.prl to omit dependencies on absolute paths, I get past the gmp problems, only to run into a similar problem with readline. I've put the appropriate flags for readline in $CPPFLAGS/$LDFLAGS and mk/build.mk, and the build ends with: Configuring readline-1.0.1.0... if ifBuildable/ifBuildable readline; then \ cd readline \ cmp -s ../Makefile.local Makefile.local || cp ../Makefile.local .; \ mv GNUmakefile GNUmakefile.tmp; \ setup/Setup makefile -f GNUmakefile; \ cmp -s GNUmakefile GNUmakefile.tmp mv GNUmakefile.tmp GNUmakefile; \ make -wr \ setup/Setup register --inplace; \ fi mv: cannot stat `GNUmakefile': No such file or directory In file included from Readline.hsc:36: include/HsReadline.h:10:31: error: readline/readline.h: No such file or directory include/HsReadline.h:11:30: error: readline/history.h: No such file or directory Readline.hsc: In function 'main': Readline.hsc:614: error: 'ISFUNC' undeclared (first use in this function) Readline.hsc:614: error: (Each undeclared identifier is reported only once Readline.hsc:614: error: for each function it appears in.) Readline.hsc:616: error: 'ISMACR' undeclared (first use in this function) Readline.hsc:618: error: 'ISKMAP' undeclared (first use in this function) Readline.hsc:723: error: 'UNDO_DELETE' undeclared (first use in this function) Readline.hsc:724: error: 'UNDO_INSERT' undeclared (first use in this function) Readline.hsc:725: error: 'UNDO_BEGIN' undeclared (first use in this function) Readline.hsc:726: error: 'UNDO_END' undeclared (first use in this function) Readline.hsc:1089: error: 'MULT_MATCH' undeclared (first use in this function) Readline.hsc:1112: error: 'SINGLE_MATCH' undeclared (first use in this function) compiling dist/build/System/Console/Readline_hsc_make.c failed command was: gcc -c -D__GLASGOW_HASKELL__=608 -I/fs/pkgs/mount/package/host/code.dogmap.org/foreign/ghc-6.8.2.20080317+spf+0/compile/src/ghc-6.8.2.20080317/includes -I/fs/pkgs/mount/package/host/code.dogmap.org/foreign/ghc-6.8.2.20080317+spf+0/compile/src/ghc-6.8.2.20080317/gmp/gmpbuild -I/package/host/code.dogmap.org/foreign/ghc-6.8.2.20080317+spf+0/conf/gmp/include -D__GLASGOW_HASKELL__=608 -Iinclude dist/build/System/Console/Readline_hsc_make.c -o dist/build/System/Console/Readline_hsc_make.o Preprocessing library readline-1.0.1.0... make[2]: Entering directory `/fs/pkgs/mount/package/host/code.dogmap.org/foreign/ghc-6.8.2.20080317+spf+0/compile/src/ghc-6.8.2.20080317/libraries/readline' make[2]: *** No targets specified and no makefile found. Stop. make[2]: Leaving directory `/fs/pkgs/mount/package/host/code.dogmap.org/foreign/ghc-6.8.2.20080317+spf+0/compile/src/ghc-6.8.2.20080317/libraries/readline' make[1]: *** [make.library.readline] Error 2 make[1]: Leaving directory `/fs/pkgs/mount/package/host/code.dogmap.org/foreign/ghc-6.8.2.20080317+spf+0/compile/src/ghc-6.8.2.20080317/libraries' make: *** [stage1] Error 2 So next, I tried also passing --with-readline-{includes,libraries} to the toplevel ./configure. As I hoped, those arguments were relayed to the configure script for readline, but that still isn't enough: Configuring readline-1.0.1.0... if ifBuildable/ifBuildable readline; then \ cd readline \ cmp -s ../Makefile.local Makefile.local || cp ../Makefile.local .; \ mv GNUmakefile GNUmakefile.tmp; \ setup/Setup makefile -f GNUmakefile; \ cmp -s GNUmakefile GNUmakefile.tmp mv GNUmakefile.tmp GNUmakefile; \ make -wr \ setup/Setup register --inplace; \ fi mv: cannot stat `GNUmakefile': No such file or directory /package/host/code.dogmap.org/foreign/gcc-4.2.3+spf+0/conf/binutils/command/ld: cannot find -lncurses collect2: ld returned 1 exit status linking dist/build/System/Console/Readline_hsc_make.o failed command was: gcc -lreadline -lncurses -L/package/host/code.dogmap.org/foreign/ghc-6.8.2.20080317+spf+0/conf/readline/library dist/build/System/Console/Readline_hsc_make.o -o dist/build/System/Console/Readline_hsc_make Preprocessing library readline-1.0.1.0... make[2]: Entering directory `/fs/pkgs/mount/package/host/code.dogmap.org/foreign/ghc-6.8.2.20080317+spf+0/compile/src/ghc-6.8.2.20080317/libraries/readline' make[2]: *** No targets specified and no makefile found. Stop. make[2]: Leaving directory `/fs/pkgs/mount/package/host/code.dogmap.org/foreign/ghc-6.8.2.20080317+spf+0/compile/src/ghc-6.8.2.20080317/libraries/readline
building against gmp in a nonstandard location
I'm having some trouble building ghc against gmp installed in an unusual place. (Actually, I install every package in its own directory, but gmp seems to be the one causing trouble here.) First attempt: adding the necessary -I, -L, and -Xlinker -R options to CPPFLAGS/LDFLAGS when invoking ./configure, and I also created mk/build.mk to add the -I flags to SRC_CC_OPTS and SRC_HSC2HS_OPTS, and -optc-I, -optl-L, and -optl-Xlinker -optl-R flags to SRC_HC_OPTS. This method worked for ghc 6.6.1, but 6.8.2 gives me: /command/ghc -H16m -O -O2 -optc-I/nil -optc-I/package/host/code.dogmap.org/foreign/ghc-6.8.2+spf+0/conf/gmp/include -optc-I/package/host/code.dogmap.org/foreign/ghc-6.8.2+spf+0/conf/ncurses/include -optc-I/package/host/code.dogmap.org/foreign/ghc-6.8.2+spf+0/conf/readline/include -optc-I/package/host/code.dogmap.org/foreign/ghc-6.8.2+spf+0/conf/binutils/include -optc-pipe -optl-pipe -optc-Os -optl-Os '-optc-march=pentium3' '-optl-march=pentium3' '-optc-mfpmath=sse' '-optl-mfpmath=sse' -optc-msse -optl-msse -optc-mmmx -optl-mmmx -optl-L/nil -optl-L/package/host/code.dogmap.org/foreign/ghc-6.8.2+spf+0/conf/gmp/library -optl-L/package/host/code.dogmap.org/foreign/ghc-6.8.2+spf+0/conf/ncurses/library -optl-L/package/host/code.dogmap.org/foreign/ghc-6.8.2+spf+0/conf/readline/library -optl-L/package/host/code.dogmap.org/foreign/ghc-6.8.2+spf+0/conf/binutils/library -optl-Xlinker -optl-R -optl-Xlinker -optl/package/host/code.dogmap.org/foreign/ghc-6.8.2+spf+0/conf/gmp/library -optl-Xlinker -optl-R -optl-Xlinker -optl/package/host/code.dogmap.org/foreign/ghc-6.8.2+spf+0/conf/ncurses/library -optl-Xlinker -optl-R -optl-Xlinker -optl/package/host/code.dogmap.org/foreign/ghc-6.8.2+spf+0/conf/readline/library -optl-Xlinker -optl-R -optl-Xlinker -optl/package/host/code.dogmap.org/foreign/ghc-6.8.2+spf+0/conf/binutils/library -istage1/utils -istage1/basicTypes -istage1/types -istage1/hsSyn -istage1/prelude -istage1/rename -istage1/typecheck -istage1/deSugar -istage1/coreSyn -istage1/vectorise -istage1/specialise -istage1/simplCore -istage1/stranal -istage1/stgSyn -istage1/simplStg -istage1/codeGen -istage1/main -istage1/profiling -istage1/parser -istage1/cprAnalysis -istage1/ndpFlatten -istage1/iface -istage1/cmm -istage1/nativeGen -Wall -fno-warn-name-shadowing -fno-warn-orphans -Istage1 -cpp -fglasgow-exts -fno-generics -Rghc-timing -I. -Iparser -package unix -ignore-package lang -recomp -Rghc-timing -H16M '-#include cutils.h' -DUSING_COMPAT -i../compat -ignore-package Cabal -package directory -package pretty -package containers -c stranal/StrictAnal.lhs -o stage1/stranal/StrictAnal.o -ohi stage1/stranal/StrictAnal.hi ghc: 34287668 bytes, 7 GCs, 122880/180224 avg/max bytes residency (2 samples), 19M in use, 0.02 INIT (0.00 elapsed), 0.20 MUT (0.33 elapsed), 0.10 GC (0.10 elapsed) :ghc make[1]: *** No rule to make target `stage1//package/host/code.dogmap.org/foreign/ghc-6.8.2+spf+0/conf/gmp/include/gmp.h', needed by `stage1/parser/cutils.o'. Stop. Second attempt: CPPFLAGS, LDFLAGS and mk/build.mk as above, and I also gave --with-gmp-{includes,libraries} to ./configure. This ends with the same error. Third attempt: all of the above, plus I modified mkdependC.prl to omit dependencies on absolute paths. (I suspect this is wrong, since I assume other people have successfully used gmp in a nonstandard path without having to patch anything, but anyway...) New failure: ../../compiler/stage1/ghc-inplace -Iinclude -package rts-1.0 -optc-O2 -odir dist/build -c cbits/longlong.c -o dist/build/cbits/longlong.o In file included from /fs/pkgs/mount/package/host/code.dogmap.org/foreign/ghc-6.8.2+spf+0/compile/src/ghc-6.8.2/includes/Stg.h:150, from /fs/pkgs/mount/package/host/code.dogmap.org/foreign/ghc-6.8.2+spf+0/compile/src/ghc-6.8.2/includes/Rts.h:19, from cbits/longlong.c:29:0: /fs/pkgs/mount/package/host/code.dogmap.org/foreign/ghc-6.8.2+spf+0/compile/src/ghc-6.8.2/includes/Regs.h:28:17: error: gmp.h: No such file or directory I would have thought the SRC_HC_OPTS options I put in mk/build.mk would be used for this compilation, but apparently not. Does anyone see where the problem might be? paul ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
gmp
I understand that gmp is needed for the certain libraries like the Prelude with Double and Integer. But I do not understand why gmp is so deeply buried in the rts. Are the basic types Int and Pointer not enough to write a compiler like ghc? Cheers Christian ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: gmp
Don Stewart wrote: However, its buried in the rts/distributed with the runtime, so that users may optionally use that version, rather than finding and installing their own external gmp package. On almost all platforms though, the distributed-with-ghc gmp is unused. But doesn't that mean that it gets linked into any program compiled with ghc? If so, that is not a good choice for a just-in-case mp lib. -Yitz ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: gmp
Christian.Maeder: I understand that gmp is needed for the certain libraries like the Prelude with Double and Integer. But I do not understand why gmp is so deeply buried in the rts. Are the basic types Int and Pointer not enough to write a compiler like ghc? Integer is a good type :) However, its buried in the rts/distributed with the runtime, so that users may optionally use that version, rather than finding and installing their own external gmp package. On almost all platforms though, the distributed-with-ghc gmp is unused. -- Don ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: gmp
phercek: Christian Maeder wrote: I understand that gmp is needed for the certain libraries like the Prelude with Double and Integer. Why is GMP needed for Double? Based on the online report Double is double precision floating; it does not need to represent arbitrary big numbers. I thought it is there only for Integers and Rationals. It isn't needed for Double. Double is a native type. ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: gmp
gale: Don Stewart wrote: However, its buried in the rts/distributed with the runtime, so that users may optionally use that version, rather than finding and installing their own external gmp package. On almost all platforms though, the distributed-with-ghc gmp is unused. But doesn't that mean that it gets linked into any program compiled with ghc? If so, that is not a good choice for a just-in-case mp lib. only on systems that, when ghc was built, there was no libgmp available. on any system where an external libgmp is available, it will be dynamically linked into the generated haskell programs, and in-tree gmp isn't used at all (or compiled, or installed) e.g $ ldd `which xmonad` /home/dons/bin/xmonad: StartEnd Type Open Ref GrpRef Name 48cc 490e3000 rlib 01 0 /usr/lib/libpthread.so.8.0 50424000 5083d000 rlib 01 0 /usr/lib/libm.so.2.3 42a92000 42ece000 rlib 01 0 /usr/local/lib/libgmp.so.7.0 4e3f2000 4e8c4000 rlib 01 0 /usr/lib/libc.so.42.0 -- Don ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: gmp
Don Stewart wrote: on any system where an external libgmp is available, it will be dynamically linked into the generated haskell programs, and in-tree gmp isn't used at all (or compiled, or installed) So on linux and *bsd, that should be fine. On Mac OS X (as a special case of *bsd), we have that framework thing. It is a real pain, and it also requires the framework on any client machine where you install your ghc-compiled binary. OK for the time being, but it would be really, really good to be able to compile ghc without gmp. This idea of a Mac OS X binary with statically-linked gmp is nice, it is really convenient. But someone needs to completely clarify the license issues in that case, and make it completely clear to all users. What is the situation on Windows? Does the standard GHC binary on Windows have dynamically linked gmp for binaries produced by ghc? Thanks, Yitz ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: gmp
Christian Maeder wrote: I understand that gmp is needed for the certain libraries like the Prelude with Double and Integer. Why is GMP needed for Double? Based on the online report Double is double precision floating; it does not need to represent arbitrary big numbers. I thought it is there only for Integers and Rationals. Peter. ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: gmp
Hi What is the situation on Windows? Does the standard GHC binary on Windows have dynamically linked gmp for binaries produced by ghc? No, they are statically linked. (Please, can no one start discussing licensing, people already know there are issues with it, and I get plenty of traffic already!) Thanks Neil ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: gmp
Hello, On Thursday 17 January 2008 17:57, Christian Maeder wrote: I understand that gmp is needed for the certain libraries like the Prelude with Double and Integer. But I do not understand why gmp is so deeply buried in the rts. Are the basic types Int and Pointer not enough to write a compiler like ghc? You probably know about this, but just to make sure: Did you take a look at http://hackage.haskell.org/trac/ghc/wiki/ReplacingGMPNotes? Cheers Christian ... Best regards Thorkil ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: gmp
Yitzchak Gale: OK for the time being, but it would be really, really good to be able to compile ghc without gmp. Well, just go ahead and write an alternative portable high- performance implementation of Integer. This idea of a Mac OS X binary with statically-linked gmp is nice, it is really convenient. But someone needs to completely clarify the license issues in that case, and make it completely clear to all users. I completely agree that we need to document the situation clearly. Otherwise, I don't really see what the fuss is about. Most GHC users don't care whether GMP is linked into their code (as its either only used internally or has a GPL-compatible license anyway). If a company wants to distribute GHC compiled binaries of non-GPL compatible code, well, they have to compile their own version of GHC on the Mac that links GMP dynamically, and then, use that version of GHC to link their final product. That is going to be a trivial task compared to developing that product in the first place. So, who cares? Manuel ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: FFI Bindings to Libraries using GMP
Hello Benedikt, I apologise for the late reply. I am travelling tomorrow but I will try to get something an alpha implementation out by this Wednesday. For now here are some preliminary answers: On Sep 28, 2007, at 7:41 AM, Benedikt Huber wrote: Am 18.09.2007 um 05:49 schrieb Peter Tanski: The best solution would be to revamp the way Integer types are implemented, so when possible they are mutable under the hood, much like using the binary += instead of the ternary +. Enumerations like the test in [1], below, would not be mutable unless there were some information such as a good consumer function that indicated the intermediate values were only temporarily necessary. I'm not sure if I understand this correctly; Do you want to expose an unsafe/IO interface for destructive Integer manipulation? I would not expose it, just optimise it, in the same way as we can replace recursion with loops at the Cmm level. The end result would involve re-cycling integer memory so you might say that in some equations integers are mutable. (If it is provable that an integer value would not be used again, then it does not seem right not to recycle the memory.) The OpenSSL library is not GPL compatible, so there would be licensing problems for GPL'd system distributions; it is also relatively slow, though it does have a noticeably constant curve for exponential functions. Maybe you should add a note to http://hackage.haskell.org/trac/ghc/wiki/ReplacingGMPNotes/ PerformanceMeasurements. The statistics suggest that the OpenSSL BN has comparable performance to the GMP, especially for smaller numbers. Some note about the (very confusing) licensing issues regarding OpenSSL would also be nice. I will add this to the wiki. In short, paragraph 10 of the GPL and paragraph 11 of the LGPL--here I may have the paragraphs wrong-- prohibit placing any additional restrictions on your licensees. OpenSSL places an additional restriction on licensees: you cannot use the name 'OpenSSL' with regard to your product, so the OpenSSL license is incompatible with the GPL/LGPL. [1] Simple Performance Test on (ghc-darwin-i386-6.6.1): Malloc is fast but not nearly as fast as the RTS alloc functions; one thing I have not started is integrating the replacement library with GHC, mostly because the replacement library (on par or faster than GMP) uses SIMD functions whenever possible and they require proper alignment. Ok, it's good to know you're already working on integrating a (native) replacement library. It's workable for now but I need to finish Toom3, a basic FFT, and some specialised division operations. I also need to give Thorkil Naur a crack at it. All of this has been on hold because I have been too selfish and perfectionistic to give anyone what I consider a mess and I have been working too many hours to fix it. (This seems to be a common problem of mine; I intend to change that.) I also performed the test with the datatype suggested by John Meacham (using a gmp library with renamed symbols), data FInteger = FInteger Int# (!ForeignPtr Mpz) but it was around 8x slower, maybe due to the ForeignPtr and FFI overhead, or due to missing optimizations in the code. That is quite an interesting result. Are these safe foreign imports? No. Note that `FInteger' above is even faster than the build-in Integer type for small integers (Ints), so I was talking about allocation of gmp integers. I elaborated the test a little, it now shows consistent results I think [1a]; a lot of performance is lost when doing many allocations using malloc, and even more invoking ForeignPtr finalizers. I found the same thing when I tried that; malloc is slow compared to GC-based alloc. The ForeignPtr finalizers do not always run since-- as far as I know--they are only guaranteed to run before RTS shutdown. I'm still interested in sensible solutions to Bug #311, and maybe nevertheless simple switching to standard gmp allocation (either with finalizers or copying limbs when entering/leaving the gmp) via a compile flag would be the right thing for many applications. I'm also looking forward to see the results of the replacement library you're trying to integrate, and those of haskell Integer implementations. The fastest interim solution I can come up with for you would be to use Isaac Dupree's Haskell-based integer library and set up preprocessor defines so you could build ghc (HEAD) from source and use that. Would that be sufficient for now? Cheers, Pete [1a] Integer Allocation Test allocTest :: Int - `some Integral Type T' allocTest iterations = (iterateT iterations INIT) where iterateT 0 v = v iterateT k v = v `seq` iterateT (k-1) (v+STEP) - Small BigNums Allocation Test (INIT = 2^31, STEP = 10^5, k=10^6) Results (utime samples.sort[3..7].average) on darwin-i386 (dualcore): 0.04s destructive-update C implementation
Re: FFI Bindings to Libraries using GMP
On Sep 14, 2007, at 9:14 AM, Benedikt Huber wrote: | I've been struggling using FFI bindings to libraries which rely on the | GNU Mp Bignum library (gmp). It's an issue that bites very few users, but it bites them hard. It's also tricky, but not impossible, to fix. The combination keeps meaning that at GHC HQ we work on things that affect more people. I doubt we can spare effort to design and implement a fix in the near future -- we keep hoping someone else step up and tackle it! Peter Tanski did exactly that (he's the author of the ReplacingGMPNotes above), but he's been very quiet recently. I don't know where he is up to. Perhaps someone else would like to join in? I apologise for being away. The company I work for has been ramping up for a launch and I have been working very long hours (nights and weekends, too). Thank you for the information - I'm also willing to help, though I'm not too familiar with the GHC internals (yet). I do like the idea of optionally linking with a pure-haskell library, but I'm interested in a solution with comparable performance. Commenting solutions to ticket #311: It goes beyond mere familiarity with the internals: the GMP functions are threaded throughout the RTS and the PrimOps files. Of all the primitive operations, they are the most ubiquitous for interfering in other code. The rough list I put on the ReplacingGMP page is a start but the more I work with the RTS the more little things keep turning up. At the least I should update the page. (2) Using the standard allocation functions for the gmp memory managment (maybe as compile flag) as suggested in http:// www.haskell.org/pipermail/glasgow-haskell-users/2006-July/ 010660.html would also resolve ticket #311. In this case at least the dynamic part of gmp integers has to be resized using external allocation functions, and a finalizer (mpz_clear) has to be called when an Integer is garbage collected. It seems that the performance loss by using malloc is significant [1], as lots of allocations and reallocations of very small chunks occur in a functional setting; some kind of (non garbage collected !) memory pool allocation would certainly help. I'm not sure what overhead is associated with calling a finalizer ? The problem of lots of small allocations affects the garbage collector, as well. In the current implementation, each GMP operation calls doYouWantToGC()--I'm sure you have seen the note in PrimOps.cmm, for example--which may act as a stop-world garbage collection. The byte arrays for GMP are also pinned. Compared to this, a FFI implementation using finalizers, which have horrible but practical guarantees on when they are called, would work much better. The best solution would be to revamp the way Integer types are implemented, so when possible they are mutable under the hood, much like using the binary += instead of the ternary +. Enumerations like the test in [1], below, would not be mutable unless there were some information such as a good consumer function that indicated the intermediate values were only temporarily necessary. (3) So when replacing GMP with the BN library of OpenSSL (see http://hackage.haskell.org/trac/ghc/wiki/ReplacingGMPNotes/ PerformanceMeasurements), it would propably be neccessary to refactor the library, so custom allocation can be used as well. This does not seem too difficult at a first glance though. The OpenSSL library is not GPL compatible, so there would be licensing problems for GPL'd system distributions; it is also relatively slow, though it does have a noticeably constant curve for exponential functions. The one problem you will find with _all_ potential replacement libraries is incompatible behaviour for bitwise functions: they are implemented arithmetically in GMP but logically elsewhere (when they are implemented at all). (Note: if you are looking for the left-shift and right-shift operations in GMP, they are hidden in mpz_mul_2exp and mpz_t_div_q_2exp.) LibTomMath, for example uses pure logical shifts which do not produce correct results. I could go on about many other small differences but the end result is that you would have to do a lot of hacking to get a library that would replace all the functionality GMP provides. That is why I started a replacement from scratch. So I'd like to investigate the second or third option, as far as my knowledge and time permits it. Of course it would be wise to check first if Peter Tanski is already/still working on a GMP replacement. I left off working on it for some time, but things may slow down a little for now so I will (hopefully) have time to package it up. I meant to do that more than a month ago for Thorkil, who has written a multi-precision integer library before and wanted to help. [1] Simple Performance Test on (ghc-darwin-i386-6.6.1): The haskell function (k
Re: FFI Bindings to Libraries using GMP
Peter Tanski wrote: The one problem you will find with _all_ potential replacement libraries is incompatible behaviour for bitwise functions: they are implemented arithmetically in GMP but logically elsewhere (when they are implemented at all). I don't fully understand this... I made sure my Haskell implementation's Bits was consistent with the existing (GMP) version, which you call arithmetically... it seemed that any other semantics would reveal (depend on) implementation details. and I was able to make it reasonably asymptotically-efficient anyway. Isaac ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
RE: FFI Bindings to Libraries using GMP
I won't be at the hackathon I fear (and neither will Simon M) but I jotted down some notes about how to replace GMP with a Haskell library http://hackage.haskell.org/trac/ghc/wiki/ReplacingGMPNotes/HaskellLibrary Simon | -Original Message- | From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On | Behalf Of Matthias Neubauer | Sent: 11 September 2007 22:04 | To: Simon Peyton-Jones | Cc: glasgow-haskell-users@haskell.org; benedikth; Peter Tanski | Subject: Re: FFI Bindings to Libraries using GMP | | Simon Peyton-Jones [EMAIL PROTECTED] writes: | | Peter Tanski did exactly that (he's the author of the | ReplacingGMPNotes above), but he's been very quiet recently. I | don't know where he is up to. Perhaps someone else would like to | join in? | | Since I live only five minutes away from this year's Hackathon venue, | I guess it would be kind of silly to have such an event in my | neighborhood and not go to it! :) | | I would enjoy working on this during the hackathon. Is there a reason | to not start with a simple, Haskell-only implementation? And later use | techniques known from ByteStrings and stream fusion to improve on that. | | -Matthias | | -- | Matthias Neubauer | | Universität Freiburg, Institut für Informatik | tel +49 761 203 8060 | Georges-Köhler-Allee 79, 79110 Freiburg i. Br., Germany | fax +49 761 203 8052 | ___ | Glasgow-haskell-users mailing list | Glasgow-haskell-users@haskell.org | http://www.haskell.org/mailman/listinfo/glasgow-haskell-users ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
RE: FFI Bindings to Libraries using GMP
| I've been struggling using FFI bindings to libraries which rely on the | GNU Mp Bignum library (gmp) - this is apparently a well known problem | (http://hackage.haskell.org/trac/ghc/ticket/311, | http://hackage.haskell.org/trac/ghc/wiki/ReplacingGMPNotes). | I do rely on using such libraries, and so started to get it work on | MacOsX; no solution was close to being satisfactory however, so I'd like | to ask for some advice. It's an issue that bites very few users, but it bites them hard. It's also tricky, but not impossible, to fix. The combination keeps meaning that at GHC HQ we work on things that affect more people. I doubt we can spare effort to design and implement a fix in the near future -- we keep hoping someone else step up and tackle it! Peter Tanski did exactly that (he's the author of the ReplacingGMPNotes above), but he's been very quiet recently. I don't know where he is up to. Perhaps someone else would like to join in? Meanwhile I've added your comments to #311 so they stay with the ticket. Simon | | Those two options worked to some extend: | (1) | Create or modify the library in question, so gmp is statically linked | and its symbols are hidden. | When source code is available, this is relatively easy, altough it | requires modification of the build process (which can be a hassle). If a | static ar archive is available, it is cumbersome (at least on Mac Os X | I ran into a lot of troubles using nmedit), but possible. Furthermore, | the resulting libraries are bloated, as each of them contains a copy of | the GMP; left alone portability issues. | (2) | As suggested in ticket#311, I tried switching the allocator functions | when switching to FFI. It worked, but not in GHCi; also, doing it | manually is a lot of work, because, as far as I could figure out, it is | neccessary to write a wrapper function for every C-function (indirectly) | using the gmp. | | Furthermore, most of the libraries expose gmp datatypes (mpz_t,mpq_t) in | their API. I currently use a little haskell module working on | GHC.Exts, but that's propably not a good option from a maintainer's | point of view. There are certainly other possibilities, but I couldn't | find one which is both maintainable and portable. | | I would be very grateful for any advice, or some information on plans | for resolving ticket #311. There are some great libraries (like the | Parma Polyhedral Library, to pick an example) out there using gmp, and | it would be nice if writing bindings to those libs could be simplified. | | Thanks, | Benedikt Huber | | ___ | Glasgow-haskell-users mailing list | Glasgow-haskell-users@haskell.org | http://www.haskell.org/mailman/listinfo/glasgow-haskell-users ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: FFI Bindings to Libraries using GMP
Simon Peyton-Jones [EMAIL PROTECTED] writes: Peter Tanski did exactly that (he's the author of the ReplacingGMPNotes above), but he's been very quiet recently. I don't know where he is up to. Perhaps someone else would like to join in? Since I live only five minutes away from this year's Hackathon venue, I guess it would be kind of silly to have such an event in my neighborhood and not go to it! :) I would enjoy working on this during the hackathon. Is there a reason to not start with a simple, Haskell-only implementation? And later use techniques known from ByteStrings and stream fusion to improve on that. -Matthias -- Matthias Neubauer | Universität Freiburg, Institut für Informatik | tel +49 761 203 8060 Georges-Köhler-Allee 79, 79110 Freiburg i. Br., Germany | fax +49 761 203 8052 ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: FFI Bindings to Libraries using GMP
neubauer: Simon Peyton-Jones [EMAIL PROTECTED] writes: Peter Tanski did exactly that (he's the author of the ReplacingGMPNotes above), but he's been very quiet recently. I don't know where he is up to. Perhaps someone else would like to join in? Since I live only five minutes away from this year's Hackathon venue, I guess it would be kind of silly to have such an event in my neighborhood and not go to it! :) I would enjoy working on this during the hackathon. Is there a reason to not start with a simple, Haskell-only implementation? And later use techniques known from ByteStrings and stream fusion to improve on that. -Matthias Oh, Matthias: if you're coming to the hackathon, please register so we can get you a tshirt! :) Details on the hackathon page. http://www.haskell.org/haskellwiki/Hac_2007_II -- Don ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
FFI and Libraries using GMP
Hello, I've been struggling using FFI bindings to libraries which rely on the GNU Mp Bignum library (gmp) - this is apparently a well known problem (http://hackage.haskell.org/trac/ghc/ticket/311). I do want to use some of those libraries however and so I started to get them work on ghc-6.6.1/darwin_8_i368. But as no solution was close to being satisfactory, I'd like to ask for some advice. Those two options worked to some extend: (1) Create or modify the library in question, so gmp is statically linked and its symbols are hidden. When source code is available, this is relatively easy, altough it requires modification of the build process (which can be a hassle). If a static ar archive is available, it is cumbersome (at least on Mac Os X I ran into a lot of troubles using nmedit), but possible. Furthermore, the resulting libraries are bloated, as each of them contains a copy of the GMP; left alone portability issues. (2) As suggested in ticket#311, I tried switching the allocator functions when switching to FFI. It worked, but not in GHCi; also, doing it manually is a lot of work, because, as far as I could figure out, it is neccessary to write a wrapper function for every C-function (indirectly) using the gmp. Furthermore, most of the libraries expose gmp datatypes (mpz_t,mpq_t) in their API. I currently use a little haskell module working on GHC.Exts, but that's propably not a good option from a maintainer's point of view. There are certainly other possibilities, but I couldn't find one which is both maintainable and portable. I would be very grateful for any advice, or some information on plans for resolving ticket #311; I'm aware that there are long-term plans for replacing gmp, but I'd prefer a seperate short-term solution for the problems with the FFI. There are some great libraries (like the Parma Polyhedral Library, to pick an example) out there using gmp, and it would be nice if writing bindings to those libs could be simplified. Thanks, Benedikt ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
FFI Bindings to Libraries using GMP
Hello, I've been struggling using FFI bindings to libraries which rely on the GNU Mp Bignum library (gmp) - this is apparently a well known problem (http://hackage.haskell.org/trac/ghc/ticket/311, http://hackage.haskell.org/trac/ghc/wiki/ReplacingGMPNotes). I do rely on using such libraries, and so started to get it work on MacOsX; no solution was close to being satisfactory however, so I'd like to ask for some advice. Those two options worked to some extend: (1) Create or modify the library in question, so gmp is statically linked and its symbols are hidden. When source code is available, this is relatively easy, altough it requires modification of the build process (which can be a hassle). If a static ar archive is available, it is cumbersome (at least on Mac Os X I ran into a lot of troubles using nmedit), but possible. Furthermore, the resulting libraries are bloated, as each of them contains a copy of the GMP; left alone portability issues. (2) As suggested in ticket#311, I tried switching the allocator functions when switching to FFI. It worked, but not in GHCi; also, doing it manually is a lot of work, because, as far as I could figure out, it is neccessary to write a wrapper function for every C-function (indirectly) using the gmp. Furthermore, most of the libraries expose gmp datatypes (mpz_t,mpq_t) in their API. I currently use a little haskell module working on GHC.Exts, but that's propably not a good option from a maintainer's point of view. There are certainly other possibilities, but I couldn't find one which is both maintainable and portable. I would be very grateful for any advice, or some information on plans for resolving ticket #311. There are some great libraries (like the Parma Polyhedral Library, to pick an example) out there using gmp, and it would be nice if writing bindings to those libs could be simplified. Thanks, Benedikt Huber ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Current GMP implementation
Hello Thorkil, I updated the wiki Replacing GMP/The Current GMP Implementation page with some introductory notes. I hope the pretty graphic cuts down on a lot of wording necessary. There is no discussion of the Cmm implementation as that is contained in Esa's posts to the GHC users list or in the PrimOps pages in the new Commentary. (Maybe I should move the GMP-Cmm-Haskell discussion into The Current GMP Implementation page but I really have to get some honest coding done.) Cheers, Peter ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: Replacement for GMP: Update
On Dec 29, 2006, at 8:32 AM, Thorkil Naur wrote: On Friday 01 September 2006 06:43, Peter Tanski wrote: ... For a brief overview of the speed of the libraries I looked at carefully, see http://hackage.haskell.org/trac/ghc/wiki/ ReplacingGMPNotes (I added a few charts to show the speed of ARPREC, OpenSSL's BN, GMP and LibTomMath. ) I tested GMP and OpenSSL's BN for reference. I must confess that it took me a while to understand what you were doing here and I am still uncertain: For example, how many multiplications were actually performed for bit size 4096? 4096 = size 4 (counting from zero: sizes[5] = {256,512,1024,2048,4096}) 1,000,000 / ( 4 * 3 ) = 83,333 rounds My reason for doing this was simple: as a basic comparison, rather than an absolute measurement, the number of rounds doesn't matter as long as the results are measurable. Besides, more rounds means more time running each test. I did a lot of tweaking, especially with ARPREC, to get each library to (1) a generally available speed and (2) a configuration similar to what it would be when used with GHC. I could have measured the speed in nanoseconds, with one iteration for each calculation using random numbers of a specified size and posting the mean for a number of trials but that would have required me to use OS-X specific profiling software like Shark in order to get reliable measurements--a bit more labour intensive as it would require me to manually perform each test. (I did use random numbers of a specified size.) In addition, for Powers, the markings on the horizontal axis (256 pow(n,3), 512 pow(n,4), 1024 pow (n5) (missing a , here?), ...) on your graph seem to indicate that you are changing two different variables (the bit size and the exponent) at the same time. Yes, the testing is a bit sloppy there (so is the graph; ugly typo). The graph shows a general trend more than anything else. I actually tested the Exponentials (Powers) individually for each size and timing but posting a 3-D graph or making the graph (time = exponent/ size) seemed like overkill or would conflict with the previous measurements. Not a bad idea, though, just for clarity. I would suggest that you also quoted the original measurements directly. And perhaps (especially in case of the Powers and Division) some more details about what was actually measured. I did quote the original measurements directly. There wasn't much variance overall and I took what seemed like median results from a number of tests. What matters is the relative time to initialise and perform each computation since in a GHC-implementation each computation would require some minimal initialisation. ARPREC was built for this and in ARPREC-only tests the major factor in speed of initialisation was the time to compute the architecture and precision- specific constants for PI, Log_2 and Log_10 (the Epsilon constant doesn't require much time). Log_2 and Log_10 are necessary for printing Integers because computations in ARPREC are performed as floating-point values and must converted to decimal digits by multiplying by Log_10. (Note that printing Integers also requires a size increase as the mantissa holds the Integer value, requiring further multiplication by the float-exponent.) Details on differences between algorithms used in each library would be fairly complex: as you already know, each library (ARPREC, GMP, LibTomMath, etc.) uses a different algorithm based on the size of the number-array *and* each may have a different implementation of an algorithm--LibTomMath uses a simpler Karatsuba algorithm, for example. It is distinctly odd that squaring seems to be more expensive than multiplication for some bit sizes in 3 of the 4 measured cases. This is also an implementation matter: the particular algorithms change with size and squaring may require some extra initialisation for, say, computing the size of the result and the number of operations. All of the libraries provide special squaring functions and I used those. LibTomMath is a good example: it uses its baseline squaring algorithm for small sizes and Comba-optimised Toom-Cook or Karatsuba algorithms for larger sizes. (I purposely did not tune LibTomMath or any of the libraries because I wanted a more- or-less average-case comparison, so the Karatsuba algorithm was used for size=512 bytes (128 digits) and Toom-Cook was used for size=2048 bytes (512 digits).) So where you see LibTomMath's time dip in the middle of the 'Squaring' graph it is using the Karatsuba algorithm. ARPREC uses a FFT for everything above size=256 and calculates with fewer actual digits (it stores extra size as an exponent, just like ordinary floating point numbers). The trends in the graphs for ARPREC versus GMP generally hold true until GMP's FFT kicks in, at size 32768. Also, I wonder what divisions
Re: Replacement for GMP: Update
Hello, Thanks a lot for your reply. Here are some comments to this. As is customary, I must apologise for the late reply (the response time for this conversation seems to increase exponentially with time) which also may very well make some of the comments totally out-dated. On Friday 01 September 2006 06:43, Peter Tanski wrote: ... For a brief overview of the speed of the libraries I looked at carefully, see http://hackage.haskell.org/trac/ghc/wiki/ReplacingGMPNotes (I added a few charts to show the speed of ARPREC, OpenSSL's BN, GMP and LibTomMath. ) I tested GMP and OpenSSL's BN for reference. I must confess that it took me a while to understand what you were doing here and I am still uncertain: For example, how many multiplications were actually performed for bit size 4096? In addition, for Powers, the markings on the horizontal axis (256 pow(n,3), 512 pow(n,4), 1024 pow(n5) (missing a , here?), ...) on your graph seem to indicate that you are changing two different variables (the bit size and the exponent) at the same time. I would suggest that you also quoted the original measurements directly. And perhaps (especially in case of the Powers and Division) some more details about what was actually measured. It is distinctly odd that squaring seems to be more expensive than multiplication for some bit sizes in 3 of the 4 measured cases. Also, I wonder what divisions these libraries are capable of carrying out so much faster than comparable multiplications. For the division measurements, I may again simply be presuming something about your experiments that simply isn't true. ... I keep talking about ARPREC--why? For three reasons: (1) I trust its level of precision--this has been tested, see FFTW's benchmark page for accuracy: http://www.fftw.org/accuracy/G4-1.06GHz- gcc4/ I am not sure I understand here: With respect to Haskell Integers, I would not say that there is any concept of a level of precision: If Integer computations are not accurate, they are wrong. (2) if you look at the charts, although ARPREC is bad in division it simply blows GMP, OpenSSL and LibTomMath away: at 4096 bits (85 doubles--each double has conservatively only 48.3 or so bits of integer precision), ARPREC can take a full random number to pow(n,7) in .98 seconds, compared to 77.61 or so seconds for the leader of the Integer-based libraries, GMP. (I ran the tests many times to make sure readings weren't flukes.) Again, some additional details about these measurements would be most welcome. (3) of all the FFT-based arbitrary precision libraries available, ARPREC is the only BSD-licensed one--Takuya Ooura's library (used in MAPM) is only a FFT algorithm and not necessarily either fast or accurate. The rest of the FFT libraries available are essentially GMP-licensed. So I am in an unenviable position: I intend to fulfill my promise and get a replacement for GHC (and maybe more), but I have to essentially build better functionality into the front-runner, ARPREC. At present I have been working with vector-based algorithms that would enable me to use hardware-tuned code for Single Instruction Multiple Data (SIMD) chipsets. Currently I am researching algorithms based on operating-system supplied vector libraries. Part of this modification involves a fast transformation between a vector of large integers and an array of doubles, without loss of precision (although vectors of doubles are also working well, they do not have the same library support.) I am also working on enhancing ARPREC's division algorithm. I looked briefly at ARPREC: It seems that it works with an explicitly set maximum bound on the number of digits desired. Although it also appears that it is possible to change this bound as computations proceed, such additional administration seems difficult to embrace. This is the problem I face: GHC unfortunately does not use Integer as a mathematical operation but as a primitive type, complete with bitwise operations. I do not understand what problem you are referring to here. ... On Aug 24, 2006, at 2:39 PM, Thorkil Naur wrote: I am truly unable to tell what I would consider the ideal situation. On the one hand, I value greatly the freedom of choosing my circumstances without restraints. And I appreciate the desire of noble people like the GHC developers to be equally free of such restraints. This would point in the direction of basing Integer computations on a fairly generic implementation. Which insightful consideration would then force us to admit would probably cost a factor of, say, 2 or more in performance in some cases. The problem I face is: performance at which level? The low-bit-size (low precision) range or the high precision range? Two problems seem to be lurking here: The first is that, although taking a copy of some
Re[2]: bignums, gmp, bytestring, .. ?
Hello John, Tuesday, November 28, 2006, 4:52:09 AM, you wrote: The simple problem with Haskell and Integer is that, according to the Standard, Integer is a primitive: it is consequently implemented as part of the runtime system (RTS), there is no requirement that Integer be a primitive. a plain old implementation in haskell or via the FFI is certainly a valid one. as a part of my Core library i've defined simplest implementation of Integer for poor compilers: newtype Integer = Integer Int plusInteger = liftI2 plusInt ... it's also possible to use [Int] here if some poor compilers will really arrive -- Best regards, Bulatmailto:[EMAIL PROTECTED] ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: bignums, gmp, bytestring, .. ?
On Sat, Nov 18, 2006 at 04:46:24PM -0500, Peter Tanski wrote: The simple problem with Haskell and Integer is that, according to the Standard, Integer is a primitive: it is consequently implemented as part of the runtime system (RTS), not the Prelude or any library (though the interface to Integer is in the base library). For GHC, compiling with -fno-implicit-prelude and explicitly importing only those functions and types you need the won't get rid of Integer. Possible solutions would be to implement the Integer 'primitive' as a separate library and import it into the Prelude or base libraries, then perform an optimisation step where base functions are only linked in when needed. Except for the optimisation step, this actually makes the job easier since Integer functions would be called using the FFI and held in ForeignPtrs. (I have already done the FFI- thing for other libraries and a primitive version of the replacement.) there is no requirement that Integer be a primitive. a plain old implementation in haskell or via the FFI is certainly a valid one. John -- John Meacham - ⑆repetae.net⑆john⑈ ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: bignums, gmp, bytestring, .. ?
On Nov 22, 2006, at 8:39 PM, Donald Bruce Stewart wrote: p.tanski: to my knowledge, that would be the first Haskell implementation for PalmOS... Pretty sure Tony Sloane et al at Macquarie have had nhc98 running on the palm for quite a while. They've recently moved to YHC though, iirc. Donald, Thanks for tip! Have you seen any of this code yourself? Jeremy, Dr. Sloane would be a good person to contact. For reference, the last work on this seems to have been done in 2005. Tony Sloane (asloane -at- ics dot mq dot edu dot au). Patroklos Argyroundis did a port of GMP version 3.1.1 to WinCE, noted at http://ntrg.cs.tcd.ie/~argp/software/ntrg-gmp-wince.html . I don't know if that might give any good pointers. Cheers, Pete ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: bignums, gmp, bytestring, .. ?
On Nov 19, 2006, at 3:20 PM, Jeremy Shaw wrote: At Sun, 19 Nov 2006 13:46:10 -0500, Peter Tanski wrote: What is the problem building GMP for PalmOS? According to the GMP install documentation, it supports ARM and Motorola's m68k processors, so you would not be using generic C code. You are probably also using PRC-Tools, correct? Yes. I can not get past the configure step. I tried to build gmp 4.2.1 with prc tools 2.3. I ran configure with these options: ./configure --build=i386-linux-gnu --host=m68k-palmos But, all the test programs (conftest.c) fail to link because they use 'main ()', but palmos expects 'PilotMain ()'. I hack the configure script and changed all occurances of 'main ()' to 'PilotMain ()', but then it failed beacuse the test programs could not find MemoryMgr.h. So I invoked configure with: CFLAGS=-I=/root/n-heptane/projects/haskell/palmos/sdk-5r3/ include/ ./configure --build=i386-linux-gnu --host=m68k-palmos But now it fails to find working compiler with this error (from config.log): configure:7756: checking build system compiler m68k-palmos-gcc -I=/ root/n-heptane/projects/haskell/palmos/sdk-5r3/include/ configure:7769: m68k-palmos-gcc -I=/root/n-heptane/projects/haskell/ palmos/sdk-5r3/include/ conftest.c /usr/lib/gcc-lib/m68k-palmos/2.95.3-kgpd/libgcc.a(_exit.o)(.text +0x10): In function `exit': libgcc2.c: undefined reference to `_cleanup' /usr/lib/gcc-lib/m68k-palmos/2.95.3-kgpd/libgcc.a(_exit.o)(.text +0x16):libgcc2.c: undefined reference to `_exit' collect2: ld returned 1 exit status Did you set LDFLAGS to point to the prc-tools directory and set the first available 'ld' executable to the prc-tools 'ld'? I would help more but I am using darwinports to build prc-tools (I have not experience with prc-tools or PalmOS) and it fails to build (partly because Apple gcc 4.0.1 defaults to a dynamic C++ library, so prebinding fails... there is an easy workaround so once I get enough CPU-time to rebuild it I will try poking around some more. And, around this time, my interest in running yhi on PalmOS starts to wane. Awww... to my knowledge, that would be the first Haskell implementation for PalmOS :) As I mentioned in a prior email, there is a Haskell arbitrary precision number library (BigFloat, at http:// bignum.sourceforge.net/). You might adapt it for Integer and add it to Yhc if nothing else works. I'm not crazy about BigFloat's performance, though. Cheers, Pete ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: bignums, gmp, bytestring, .. ?
p.tanski: On Nov 19, 2006, at 3:20 PM, Jeremy Shaw wrote: And, around this time, my interest in running yhi on PalmOS starts to wane. Awww... to my knowledge, that would be the first Haskell implementation for PalmOS :) As I mentioned in a prior email, there is a Haskell arbitrary precision number library (BigFloat, at http:// bignum.sourceforge.net/). You might adapt it for Integer and add it to Yhc if nothing else works. I'm not crazy about BigFloat's performance, though. Pretty sure Tony Sloane et al at Macquarie have had nhc98 running on the palm for quite a while. They've recently moved to YHC though, iirc. -- Don ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: bignums, gmp, bytestring, .. ?
Hi Jeremy, On Nov 17, 2006, at 10:34 PM, Jeremy Shaw wrote: At Sat, 18 Nov 2006 00:44:32 +, Neil Mitchell wrote One advantage you probably haven't thought of is the size of the binary. ... On a related note -- dropping the gmp requirement would also make it easier to port yhc to non-unix platforms. I have tried on a few occasions to compile the yhc runtime for PalmOS, but I can never get gmp built for PalmOS. What is the problem building GMP for PalmOS? According to the GMP install documentation, it supports ARM and Motorola's m68k processors, so you would not be using generic C code. You are probably also using PRC-Tools, correct? Cheers, Pete ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: bignums, gmp, bytestring, .. ?
At Sun, 19 Nov 2006 13:46:10 -0500, Peter Tanski wrote: What is the problem building GMP for PalmOS? According to the GMP install documentation, it supports ARM and Motorola's m68k processors, so you would not be using generic C code. You are probably also using PRC-Tools, correct? Yes. I can not get past the configure step. I tried to build gmp 4.2.1 with prc tools 2.3. I ran configure with these options: ./configure --build=i386-linux-gnu --host=m68k-palmos But, all the test programs (conftest.c) fail to link because they use 'main ()', but palmos expects 'PilotMain ()'. I hack the configure script and changed all occurances of 'main ()' to 'PilotMain ()', but then it failed beacuse the test programs could not find MemoryMgr.h. So I invoked configure with: CFLAGS=-I=/root/n-heptane/projects/haskell/palmos/sdk-5r3/include/ ./configure --build=i386-linux-gnu --host=m68k-palmos But now it fails to find working compiler with this error (from config.log): configure:7756: checking build system compiler m68k-palmos-gcc -I=/root/n-heptane/projects/haskell/palmos/sdk-5r3/include/ configure:7769: m68k-palmos-gcc -I=/root/n-heptane/projects/haskell/palmos/sdk-5r3/include/ conftest.c /usr/lib/gcc-lib/m68k-palmos/2.95.3-kgpd/libgcc.a(_exit.o)(.text+0x10): In function `exit': libgcc2.c: undefined reference to `_cleanup' /usr/lib/gcc-lib/m68k-palmos/2.95.3-kgpd/libgcc.a(_exit.o)(.text+0x16):libgcc2.c: undefined reference to `_exit' collect2: ld returned 1 exit status And, around this time, my interest in running yhi on PalmOS starts to wane. j. Using autoconf is like playing chess from 20 feet away by flicking a rope to move the pieces -mbp ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: bignums, gmp, bytestring, .. ?
On Nov 17, 2006, at 7:24 PM, Claus Reinke wrote: it seems that haskell versions of bignums is pretty much gone from more recent discussions of gmp replacements. now, I assume that there are lots of optimizations that keep gmp popular that one wouldn't want to have to reproduce, so that a haskell variant might not be competitive even if one had an efficient representation, but - do all those who want to distribute binaries, but not dynamically linked, need bignums? You are right: most don't. Even when working with larger numbers, I have very rarely used bignum libraries myself, mostly because there is usually a clever--and often faster--way to deal with large numbers, especially when you don't require all that extra precision. These methods were better known and relatively widely used before multi-precision libraries became so widespread and have even become more useful since 64-bit machines and C99's 64-bit ints came around. Integers are mostly a convenience. Large numbers are necessary if you need very high precision mathematical calculations or if you are doing cryptography; for that matter, high precision mathematics usually benefits more from arbitrary precision decimal (fixed or floating point) for certain calculations. The simple problem with Haskell and Integer is that, according to the Standard, Integer is a primitive: it is consequently implemented as part of the runtime system (RTS), not the Prelude or any library (though the interface to Integer is in the base library). For GHC, compiling with -fno-implicit-prelude and explicitly importing only those functions and types you need the won't get rid of Integer. Possible solutions would be to implement the Integer 'primitive' as a separate library and import it into the Prelude or base libraries, then perform an optimisation step where base functions are only linked in when needed. Except for the optimisation step, this actually makes the job easier since Integer functions would be called using the FFI and held in ForeignPtrs. (I have already done the FFI- thing for other libraries and a primitive version of the replacement.) - it would be nice to know just how far off a good haskell version would be performance-wise.. There is actually a relatively recent (2005, revised) Haskell version of an old Miranda library for infinite precision floating point numbers by Martin Guy, called BigFloat, at http:// bignum.sourceforge.net/. Of course, it is floating point and Integers would be faster but the general speed difference between the two would probably be proportional to the speed difference in C and so would be just as disappointing. The BigFloat library (using the Haskell version) came in last place at the Many Digits Friendly Competition for 2005 (see http://www.cs.ru.nl/~milad/manydigits/ final_timings_rankings.html), though you would probably be more interested in looking at the actual timing results to get a better idea. (The fastest competitors were MPFR, which uses GMP, and The Wolfram Team, makers of Mathematica; BigFloat actually beat iRRAM and Maple solutions for several problems.) The real problem with an Integer library written in *pure* Haskell-- especially with Integers--is simple: Haskell is too high-level and no current Haskell compiler, even JHC, has even remotely decent support for low-level optimisations such as being able to unroll a loop over two arrays of uint32_t and immediately carry the result from adding the first elements from each array to the addition of the next two, in two machine instructions. I shouldn't have to mention parallelization of operations. In short, if you look at general assembler produced from any Haskell compiler, it is *very* ugly and Arrays are even uglier. (For a simple comparison to Integer problems, try implementing a fast bucket sort in Haskell.) GMP uses hand-written assembler routines for many supported architectures, partly because GMP was originally created for earlier versions of GCC which could not optimise as well as current versions. Even GMP cannot compare to an optimised library using SIMD (Altivec, SSE)--in my tests, SIMD-optimised algorithms are between 2x to 10x faster. SIMD and small assembler routines (especially for architectures without SIMD, especially) are what I have been doing the bulk of my work on. I doubt I have the ability to extend the current state of the art with regard to higher-level polynomial optimisations, so I am always trying out any algorithm I can find. (For very high precision multiplication (more than 30,000 bits), not much beats a SIMD-enabled Fast Fourier Transform; a specially coded Toom-3 algorithm would be faster but for very large operands the algorithm becomes prohibitively complex. Division is another labour- intensive area.) - what would be a killer for numerical programming, might still
bignums, gmp, bytestring, .. ?
it seems that haskell versions of bignums is pretty much gone from more recent discussions of gmp replacements. now, I assume that there are lots of optimizations that keep gmp popular that one wouldn't want to have to reproduce, so that a haskell variant might not be competitive even if one had an efficient representation, but - do all those who want to distribute binaries, but not dynamically linked, need bignums? - it would be nice to know just how far off a good haskell version would be performance-wise.. - what would be a killer for numerical programming, might still be quite acceptable for a substantial part of haskell uses? of course, the real gmp replacement project might be going so well that a haskell version would be obsolete rather sooner than later, and i certainly don't want to interfere with that effort. all that being said, it occurred to me that the representations and fusions described in the nice rewriting haskell strings paper would be a good foundation for a haskell bignum project, wouldn't they? http://www.cse.unsw.edu.au/~dons/fps.html http://hackage.haskell.org/trac/ghc/wiki/ReplacingGMPNotes has anyone been looking into this option? just another thought, claus ps. while I'm at it: claiming that array fusion .. has received comparatively little attention sounds a bit dangerous to me, and the references are all too limited - even if you meant in the Haskell world (and PADL is no Haskell event). you might want to try searching with some other search terms, keeping in mind that most of the work will have been done for not so declarative languages: it isn't my area, but loop fusion and with-loop folding come to mind, and I'm sure that all those array language and numerical programming folks would be rather, well, disappointed?, to see their efforts ignored like that. ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: bignums, gmp, bytestring, .. ?
claus.reinke: it seems that haskell versions of bignums is pretty much gone from more recent discussions of gmp replacements. now, I assume that there are lots of optimizations that keep gmp popular that one wouldn't want to have to reproduce, so that a haskell variant might not be competitive even if one had an efficient representation, but - do all those who want to distribute binaries, but not dynamically linked, need bignums? - it would be nice to know just how far off a good haskell version would be performance-wise.. - what would be a killer for numerical programming, might still be quite acceptable for a substantial part of haskell uses? of course, the real gmp replacement project might be going so well that a haskell version would be obsolete rather sooner than later, and i certainly don't want to interfere with that effort. all that being said, it occurred to me that the representations and fusions described in the nice rewriting haskell strings paper would be a good foundation for a haskell bignum project, wouldn't they? http://www.cse.unsw.edu.au/~dons/fps.html http://hackage.haskell.org/trac/ghc/wiki/ReplacingGMPNotes has anyone been looking into this option? Interesting, what kind of operations can you imagine fusing? just another thought, claus ps. while I'm at it: claiming that array fusion .. has received comparatively little attention sounds a bit dangerous to me, and the references are all too limited - even if you meant in the Haskell world (and PADL is no Haskell event\emph{)., Yes, clearly this is in reference to rewriting-based/combinator-based deforestation. Space constraints meant we couldn't address imperative loop fusion strategies in any depth. Cheers, Don ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: bignums, gmp, bytestring, .. ?
Hi - do all those who want to distribute binaries, but not dynamically linked, need bignums? - it would be nice to know just how far off a good haskell version would be performance-wise.. - what would be a killer for numerical programming, might still be quite acceptable for a substantial part of haskell uses? One advantage you probably haven't thought of is the size of the binary. Currently GMP adds about 50Kb on to the Yhc runtime, for what in the most cases is probably an occasional addition. If the bytecode for a bignum library was less than this then there could be a substantial size saving. (Of course, Yhc has absolutely no license issues with libgmp, and would have substantially worse performance than GHC at bignum optimisation) Thanks Neil ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: bignums, gmp, bytestring, .. ?
At Sat, 18 Nov 2006 00:44:32 +, Neil Mitchell wrote One advantage you probably haven't thought of is the size of the binary. Currently GMP adds about 50Kb on to the Yhc runtime, for what in the most cases is probably an occasional addition. If the bytecode for a bignum library was less than this then there could be a substantial size saving. On a related note -- dropping the gmp requirement would also make it easier to port yhc to non-unix platforms. I have tried on a few occasions to compile the yhc runtime for PalmOS, but I can never get gmp built for PalmOS. So, a byte-code version could be advantageous there too. I don't plan to do a lot of bignum intensive computation, so it would not be a problem if the byte-code version was 10 or 100 times slower. j. ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Update to Replacement GMP page
Hello all, I made another update to the notes on Replacing GMP, at http:// hackage.haskell.org/trac/ghc/wiki/ReplacingGMPNotes . It's pretty basic and you'd probably find it shabby, but comments, modifications appreciated. I am still in the throes of trying to *portably* beat GMP for speed and accuracy. (The portability problem comes in when I may no longer rely on Altivec/SSE/3DNow! support on other processors and need to optimise in pure C.) -Pete ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: Replacement for GMP: Update
Hello Thorkil, I am very sorry for the late reply. I have been extremely busy and I wanted to give you a coherent answer. For a brief overview of the speed of the libraries I looked at carefully, see http://hackage.haskell.org/trac/ghc/wiki/ReplacingGMPNotes (I added a few charts to show the speed of ARPREC, OpenSSL's BN, GMP and LibTomMath. I did not add speed tests for Crypto++ and Botan because they don't measure up. The original timings I obtained for them were based on their own benchmarks which are inadequate and (for Crypto++) based on tuned assembly code only available on Pentium4 processors with SSE.) I tested GMP and OpenSSL's BN for reference. Over the past few weeks I tore Crypto++ apart and modified a few things, only to find out that it has a persistent bug: woven throughout the library is a conversion from 32-bit to 64-bit ints using unions. This kind of transformation breaks the C's (and C++'s) aliasing rules (thanks to John Skaller for pointing out the problem), so compiling Crypto++ with optimisations turned on (g++ -O3) introduces failures, especially in the Division algorithms. I could change the unions to bitwise transformations with masks but I would have to really dig out all the references. After some more rigorous timing tests I found that I would have to essentially rewrite the algorithms anyway. What a mess. After some more research I found that there really are no other good public domain or BSD-compatible licensed libraries available. I tested two other free arbitrary precision integer libraries, MPI and MAPM, but they were also too slow, sometimes as much as 4 times slower. MAPM uses a Fast Fourier Transform (FFT) from Takuya Ooura (see http://momonga.t.u-tokyo.ac.jp/~ooura/fft.html) and should have been fast but turned out to be even slower than MPI. If you look at the ReplacingGMPNotes page I mentioned at the top of this email, the charts show that LibTomMath is weak in multiplication--at larger integer sizes (2048-4096 bits) it is half as fast as GMP, or worse. On the other hand ARPREC, which also uses a FFT algorithm, is slow at lower precisions (256-512 bits) for two reasons: (1) at relatively low precisions, ARPREC defaults to faster standard algorithms instead of its FFT and (2) when using its fast FFT at medium levels of precision (512) bits the FFT is too ponderous keep up with the relatively lighter and faster algorithms of the int-based libraries (GMP, OpenSSL's BN and LibTomMath). (As a little history, ARPREC used to be fairly bad relative to other FFT programs available but was redone after 2003 so by 2004 it was fairly good, it is up to version 2.1.94 now and it is much better; if you are looking at ARPREC benchmarks online prior to 2003, they are too old to be good indicators of its present capability.) I keep talking about ARPREC--why? For three reasons: (1) I trust its level of precision--this has been tested, see FFTW's benchmark page for accuracy: http://www.fftw.org/accuracy/G4-1.06GHz- gcc4/ (2) if you look at the charts, although ARPREC is bad in division it simply blows GMP, OpenSSL and LibTomMath away: at 4096 bits (85 doubles--each double has conservatively only 48.3 or so bits of integer precision), ARPREC can take a full random number to pow(n,7) in .98 seconds, compared to 77.61 or so seconds for the leader of the Integer-based libraries, GMP. (I ran the tests many times to make sure readings weren't flukes.) (3) of all the FFT-based arbitrary precision libraries available, ARPREC is the only BSD-licensed one--Takuya Ooura's library (used in MAPM) is only a FFT algorithm and not necessarily either fast or accurate. The rest of the FFT libraries available are essentially GMP-licensed. So I am in an unenviable position: I intend to fulfill my promise and get a replacement for GHC (and maybe more), but I have to essentially build better functionality into the front-runner, ARPREC. At present I have been working with vector-based algorithms that would enable me to use hardware-tuned code for Single Instruction Multiple Data (SIMD) chipsets. Currently I am researching algorithms based on operating-system supplied vector libraries. Part of this modification involves a fast transformation between a vector of large integers and an array of doubles, without loss of precision (although vectors of doubles are also working well, they do not have the same library support.) I am also working on enhancing ARPREC's division algorithm. This is the problem I face: GHC unfortunately does not use Integer as a mathematical operation but as a primitive type, complete with bitwise operations. From my experience, GHC users typically use Integers at lower precisions (typically under 256 bits) and they do not use Integer for higher math. (Integer math operations are basic primitives, as you already know.) I
Re: Replacement for GMP: Update
Hello Peter, Sorry for the late reply. From your latest communication which seems to be Date: Sat, 12 Aug 2006 21:12:05 -0400 From: Peter Tanski [EMAIL PROTECTED] Subject: Re: OpenSSL License (was Replacement for GMP: Update) To: John Goerzen [EMAIL PROTECTED] I am a bit uncertain where the matter stands right now. Nevertheless, I wish to thank you for your reply. Clearly, you have done a lot of work, not least to investigate alternatives. You write I hope I have answered a few of your points; some of them have given me more work but that is all right, I guess. ... And you most certainly have, much more detailed than I could have hoped for. And hopefully, this additional work that you mention is not something that you considered a waste of time. I am also most happy to read that Before I actually bind any library with GHC, I will thoroughly test it as a stand-alone library ... since correctness of Integer computation is by far its most important property. (I would say that performance comes second, then what could be termed interoperability as third.) The support for Integer computation is a messy subject: On the one hand, even children out of the first few years of school are capable of understanding the basic functionality involved. On the other hand, providing efficient support of this functionality, especially if required in a wide selection of circumstances, requires a surprising amount of hard work and insight. I would say, if only really large integers were routinely required in actual, real-world computations, our CPUs would support these computations and there would be no need to consider the question in the, admittedly, fairly limited context of GHC. The situation seems to be that the functionality is perhaps considered obviously available, but the reality is that it can only be attained at significant cost (either real money or honest sweat). I am truly unable to tell what I would consider the ideal situation. On the one hand, I value greatly the freedom of choosing my circumstances without restraints. And I appreciate the desire of noble people like the GHC developers to be equally free of such restraints. This would point in the direction of basing Integer computations on a fairly generic implementation. Which insightful consideration would then force us to admit would probably cost a factor of, say, 2 or more in performance in some cases. On the other hand, the existence of libraries like GMP seems to offer an economic path out of this jungle. However, as the discussion over the past couple of weeks illustrates, the path may be unacceptably thorny. Let me quote some additional statements from this debate: On Thursday 10 August 2006 19:00, Simon Peyton-Jones wrote: ... OK so you are talking of taking a copy of the BN code, changing the function names, and statically linking its allocator to GHC's RTS. If someone wants to make a program that uses BN for some other purpose, they get the original BN, whose function names don't clash, and which uses malloc. Sounds ok to me. Simon ... Two problems seem to be lurking here: The first is that, although taking a copy of some existing library and massaging it to become working under some presently available circumstances may be fine, what is really needed is something that keeps on working, even under changing and future circumstances. The second is that if I wish to mix Haskell and non-Haskell code that processes the same data, I may find myself in need of a way to convert between whatever is required by this copy of some existing library that has been created and the presently available, but potentially quite different, version of that same library. No offence is meant here: I know that I am simply re-iterating what I have said earlier. However, I believe that this is sufficiently important to risk being ridiculed for pointing out the obvious. Then you ask ... If you have the time, would you be able to find any programs you might have that displayed poor Integer performance or possibly bottlenecks in the current system? Let me say first that I have not experienced any really surprisingly poor performance of the current system. To be sure, I have seen programs similar to my own and not written in Haskell that performed better on comparable tasks. But I have not been able to say specifically that this poorer performance of my program was caused by some inadequacy in the Haskell (with GMP) implementtion. I better be specific here: What I do is integer factorization (splitting integers into their prime factors). And, for example, I have implemented a version of the ECM (Elliptic Curve Method) in Haskell. Having done that (independently and mainly to learn what it was about), I eventually compared it in performance to GMP-ECM which Paul Zimmermann wrote. And I was initially disappointed to find that GMP-ECM was about 7 times faster than my ECM code
RE: Re[2]: Replacement for GMP: Update
On 21 August 2006 16:20, Bulat Ziganshin wrote: Hello Simon, Monday, August 21, 2006, 6:55:47 PM, you wrote: Parallel major GC was worked on by Roshan James during his internship here this summer, and we have some working code, but it needs a lot more testing and tuning, and it may be some time before we can incorporate this into the mainline. so it seems that it will not be included in 6.6.* ? Unfortunately, no. I hope to work on it after the 6.6 release. Cheers, Simon ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: Replacement for GMP: Update
Bulat Ziganshin wrote: Hello skaller, Sunday, August 13, 2006, 4:34:14 AM, you wrote: But the state of the art is then two stages behind the requirement: Haskell still has to 'world stop' threads to do a major collection. is not exactly true. look at Non-stop Haskell (http://www.haskell.org/~simonmar/papers/nonstop.pdf) i don't know why it is not included in 6.6 or previous version Simply because it adds overhead (both in runtime and code complexity), and the benefits are relatively modest. To comment on another point in this thread: currently the generation 0 GCs (minor GCs) also stop-the-world, even on a multiprocessor. This is a significant problem, and we have some ideas for fixing it, but no code yet (it's a possible intern project, if anyone is interested). Parallel major GC was worked on by Roshan James during his internship here this summer, and we have some working code, but it needs a lot more testing and tuning, and it may be some time before we can incorporate this into the mainline. Cheers, Simon ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re[2]: Replacement for GMP: Update
Hello Simon, Monday, August 21, 2006, 6:55:47 PM, you wrote: is not exactly true. look at Non-stop Haskell Simply because it adds overhead (both in runtime and code complexity), and the benefits are relatively modest. i think that GC that don't stops the world is just a different product. even if it makes program, say, 2 times slower overall. just imagine game like Quake written in GHC ;) of course, i understand that you don't want to support one more GC architecture, just mentioned why it may matter for some people Parallel major GC was worked on by Roshan James during his internship here this summer, and we have some working code, but it needs a lot more testing and tuning, and it may be some time before we can incorporate this into the mainline. so it seems that it will not be included in 6.6.* ? -- Best regards, Bulatmailto:[EMAIL PROTECTED] ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re[8]: Replacement for GMP: Update
Hello skaller, Sunday, August 13, 2006, 4:34:14 AM, you wrote: I know very little about Haskell, let alone GHC internals me too. so it's better to wait for comments about your thoughts from GHC team than from me. but at least i can said that But the state of the art is then two stages behind the requirement: Haskell still has to 'world stop' threads to do a major collection. is not exactly true. look at Non-stop Haskell (http://www.haskell.org/~simonmar/papers/nonstop.pdf) i don't know why it is not included in 6.6 or previous version So I'm bringing into question whether these nice 'optimisations' are actually worthwhile. They actually seem to *degrade* performance, not improve it, when we're running with a large number of CPUs. Stopping the world if you have 20,000 CPU's will happen so often, all the CPU's will be idle 99.99% of the time :) btw, one GHC intern worked on multi-processor GC and i hope that it will be included in 6.6. so, the GC will also use all these 20k cpus :) or Intel/IBM/Sun will start make some FP chips. they already do this actually, just these chips still emulate x86/sparc/... ones :) -- Best regards, Bulatmailto:[EMAIL PROTECTED] ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: Replacement for GMP: Update
* Peter Tanski: Quite right; my mistake: under the OpenSSL license a developer cannot mention features of the software in advertising materials, so the license grant of the GPL-OpenSSL program to the developer is void. The reason I mentioned users only was that in the particular problem we have here GHC does not use any other GPL programs (I think I am correct--readline is the unix version, not the GPL version, correct?) On most systems, readline is GPLed. There is a non-copyleft reimplementation somewhere, but I don't think it's widely used. The advertising requirement in the OpenSSL license would certainly constitute a further restriction under GPLv2 section 6; the strange implication is that the no further restriction clause is so broad It has to be very broad, otherwise developers could bypass the copyleft concept. the same clause (verbatim) in section 10 of the LGPL means the GPL license is incompatible with the terms of the LGPL! The LGPL permits relicensing of covered works under the GPL. Without that provision, it would indeed be incompatible with the GPL. ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re[4]: Replacement for GMP: Update
Hello skaller, Saturday, August 12, 2006, 7:06:15 AM, you wrote: My point here is that actually this is a disastrous optimisation in a multi-processing environment, because in general, the assignment of a pointer means the store isn't write once. :) all the variables rewritten is local to the function. _without_ tail call optimization, we create new stack frame for each recursive call. _with_ optimization, we just update vars in the first stack frame created because we know that these vars will not be reused after return from call I haven't figured all the details. I am imagining a generational copying collector with a separate young heap for every thread, whilst the great grand mother heap is collected in the background. it's close to the GHC scheme - 2-generational GC with 1st generation local to thread and having default size of 256kb (tied to the size of L2 caches of modern CPUs) while 2nd generation is global for all threads and collected at moments when memory currently allocated from OS exhausted. but currently 2nd-generation GCs don't runs in background and can be as long as several seconds. it's definitely hard to write games (or even web servers) with such architecture! :) so memory allocation is very fast. 1st-level GC is also fast because it runs inside L2 cache. access to local variables (i.e. data structures allocated in near past) is fast because they are all inside L2 cache. although Haskell don't supports auto variables which don't need GC and freed automatically, this scheme allows to significantly lower cost of managing short-lived data structures -- Best regards, Bulatmailto:[EMAIL PROTECTED] ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: Re[4]: Replacement for GMP: Update
On Sat, 2006-08-12 at 10:58 +0400, Bulat Ziganshin wrote: Hello skaller, Saturday, August 12, 2006, 7:06:15 AM, you wrote: My point here is that actually this is a disastrous optimisation in a multi-processing environment, because in general, the assignment of a pointer means the store isn't write once. :) all the variables rewritten is local to the function. _without_ tail call optimization, we create new stack frame for each recursive call. _with_ optimization, we just update vars in the first stack frame created because we know that these vars will not be reused after return from call Yes, but this defeats the use of the kind of collector I attempted to describe. The problem will occur if the 'stack' is aged: in that case the sweep can miss the mutation and reap a live object. -- John Skaller skaller at users dot sf dot net Felix, successor to C++: http://felix.sf.net ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: Replacement for GMP: Update
On 2006-08-10, Peter Tanski [EMAIL PROTECTED] wrote: Summary: I finally settled on modifying OpenSSL, since that would be the easiest to work with under GHC's hood (plain C code, not C++). I have to: Have you carefully investigated the OpenSSL license? We in Debian have had repeated problems since the OpenSSL license is, as written, incompatible with the GPL (even linking to OpenSSL is incompatible with the GPL). I would hate to have a situation where all GHC-compiled programs can't be under the GPL. -- John ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re[6]: Replacement for GMP: Update
Hello skaller, Saturday, August 12, 2006, 12:34:54 PM, you wrote: My point here is that actually this is a disastrous optimisation in a multi-processing environment, because in general, the assignment of a pointer means the store isn't write once. :) all the variables rewritten is local to the function. _without_ tail call optimization, we create new stack frame for each recursive call. _with_ optimization, we just update vars in the first stack frame created because we know that these vars will not be reused after return from call Yes, but this defeats the use of the kind of collector I attempted to describe. The problem will occur if the 'stack' is aged: in that case the sweep can miss the mutation and reap a live object. well, i don't know how updatable vars in stack interoperate with GC. let's someone else answer this question. but afaik GHC 6.5 supports multi-cpu execution of multiple Haskell threads (which makes it a better language for modern multi-core cpus), tail-call optimization and 2-generation GC (actually, it supports any number of generations). also, GHC supports mutable arrays. you search GHC bug tickets for one that optimized GC with mutable vars: in ghc64 _each_ GC, including minor ones need to scan _every_ updatable reference (IORef/STRef) and _each_ element of _every_ updatable array (IOArray/STArray) and this significantly slowdowns many programs, including GHC itself. in ghc65 better scheme now used -- Best regards, Bulatmailto:[EMAIL PROTECTED] ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: OpenSSL License (was Replacement for GMP: Update)
John, Have you carefully investigated the OpenSSL license? We in Debian have had repeated problems since the OpenSSL license is, as written, incompatible with the GPL (even linking to OpenSSL is incompatible with the GPL). I would hate to have a situation where all GHC-compiled programs can't be under the GPL. I have been discussing this very issue with several people in this list. At first I thought it was a peripheral issue because I had not carefully reviewed the GPL license until Florian Weimer pointed out what the problem was. By that time I had already decided to work with another replacement library--I have not yet decided which because I am still testing things out--based on good arguments other people gave for not subjecting users to OpenSSL's advertising clause. The libraries I am working with are either public domain or BSD licenses. Sorry to scare you. Best regards, Peter ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: Re[2]: Replacement for GMP: Update
On Fri, 2006-08-11 at 09:34 +0400, Bulat Ziganshin wrote: why you say that ForeignPtr is slow? afaik, malloc/free is slow, but starting from ghc 6.6 speed of _using_ ForeignPtr is the same as for Ptr Er, malloc/free is not slow .. its very fast. I implemented an arena based allocator (one which just increments a pointer) and it was slower than malloc .. ok so my implementation was naive, but still, that does make malloc pretty good. After all on the average call where an object of that size is free already it is a single array lookup, we have: (a) fetch pointer (one read) (b) fetch next (one read) (c) store next as current (one write) It is very hard to beat that. Indeed, the whole cost of this technique is probably in the mutex based locking that wraps it on a multi-processor. A purely functional system -- one which does NOT convert self tail calls into jumps and reuse storage -- can perhaps be faster, since each thread can have its own local arena to allocate from (without need of any write barrier) .. however it isn't clear to me what the trade off is between cache misses and parallelism. Doesn't Haskell do that optimisation? It is of course hard to test this today without a fairly expensive box with enough CPUs on the one bus. -- John Skaller skaller at users dot sf dot net Felix, successor to C++: http://felix.sf.net ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
RE: Re[2]: Replacement for GMP: Update
The issue isn't that malloc is slow, but rather that we don't know when to call 'free'. To make sure that we do call free, we must make GHC's garbage collector remember when it drops the reference to the object, and then call free. That is what Peter refers to as the ForeignPtr solution. It's gotten a lot cheaper in GHC 6.6 than it used to be, so it's worth trying. A merit of the approach is that is avoids fiddling with the bignum allocator at all. Simon | -Original Message- | From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] | On Behalf Of skaller | Sent: 11 August 2006 09:21 | To: Bulat Ziganshin | Cc: Peter Tanski; glasgow-haskell-users@haskell.org | Subject: Re: Re[2]: Replacement for GMP: Update | | On Fri, 2006-08-11 at 09:34 +0400, Bulat Ziganshin wrote: | | why you say that ForeignPtr is slow? afaik, malloc/free is slow, but | starting from ghc 6.6 speed of _using_ ForeignPtr is the same as for Ptr | | Er, malloc/free is not slow .. its very fast. I implemented | an arena based allocator (one which just increments a pointer) | and it was slower than malloc .. ok so my implementation was | naive, but still, that does make malloc pretty good. | | After all on the average call where an object of that | size is free already it is a single array lookup, we have: | | (a) fetch pointer (one read) | (b) fetch next (one read) | (c) store next as current (one write) | | It is very hard to beat that. Indeed, the whole cost | of this technique is probably in the mutex | based locking that wraps it on a multi-processor. | | A purely functional system -- one which does NOT convert | self tail calls into jumps and reuse storage -- can | perhaps be faster, since each thread can have its own | local arena to allocate from (without need of any write | barrier) .. however it isn't clear to me what the trade | off is between cache misses and parallelism. | | Doesn't Haskell do that optimisation? | | It is of course hard to test this today without a | fairly expensive box with enough CPUs on the one bus. | | -- | John Skaller skaller at users dot sf dot net | Felix, successor to C++: http://felix.sf.net | | ___ | Glasgow-haskell-users mailing list | Glasgow-haskell-users@haskell.org | http://www.haskell.org/mailman/listinfo/glasgow-haskell-users ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: Replacement for GMP: Update
Einar Karttunen wrote: On 10.08 11:16, Peter Tanski wrote: Paragraph 6 of the OpenSSL (1998-2005) license states that: * 6. Redistributions of any form whatsoever must retain the following *acknowledgment: *This product includes software developed by the OpenSSL Project *for use in the OpenSSL Toolkit (http://www.openssl.org/). All developers would have to do is include the acknowledgment stated above. I think this is not bad for specific applications, but forcing this upon all code compiled by GHC would be bad. I think the compiler should not link applications by default to things that force license related things. Most packages (including the base package) already force you to do license-related things when you distribute binaries: - Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. The OpenSSL license is no worse in this respect. Cheers, Simon ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: Replacement for GMP: Update
* Peter Tanski: On the other hand, the OpenSSL FAQ at http://www.openssl.org/support/ faq.html#LEGAL2 mentions that some GPL programs do not allow binary combination (static linking) or interoperation (dynamic linking) with OpenSSL. Honestly I have not seen any GPL licenses like this. The GNU GPL version 2, at http://www.gnu.org/licenses/ gpl.html, does not mention OpenSSL, nor does version 1, nor does the GNU LGPL, at http://www.gnu.org/licenses/lgpl.html. This is the offending part: * 3. All advertising materials mentioning features or use of this software *must display the following acknowledgement: *This product includes cryptographic software written by * Eric Young ([EMAIL PROTECTED]) *The word 'cryptographic' can be left out if the rouines from the library *being used are not cryptographic related :-). It's generally believed that this is a further restriction in the sense of section 6 of the GPL (version 2). In any case, I think it would be more of a restriction to someone *using* the OpenSSL program, not a developer. It's a problem for a developer who wants to use a GPLed library written by someone else, too. ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: Replacement for GMP: Update
Florian, This is the offending part: * 3. All advertising materials mentioning features or use of this software *must display the following acknowledgement: *This product includes cryptographic software written by * Eric Young ([EMAIL PROTECTED]) *The word 'cryptographic' can be left out if the rouines from the library *being used are not cryptographic related :-). It's generally believed that this is a further restriction in the sense of section 6 of the GPL (version 2). In any case, I think it would be more of a restriction to someone *using* the OpenSSL program, not a developer. It's a problem for a developer who wants to use a GPLed library written by someone else, too. Quite right; my mistake: under the OpenSSL license a developer cannot mention features of the software in advertising materials, so the license grant of the GPL-OpenSSL program to the developer is void. The reason I mentioned users only was that in the particular problem we have here GHC does not use any other GPL programs (I think I am correct--readline is the unix version, not the GPL version, correct?) so until the developer compiles a Haskell program with GHC (with OpenSSL) *and* that program uses a GPL program, the Haskell developer is still able transfer a valid license to users. The way the OpenSSL FAQ stated the problem, the implication was that there was specific mention of OpenSSL in a GPL license. The advertising requirement in the OpenSSL license would certainly constitute a further restriction under GPLv2 section 6; the strange implication is that the no further restriction clause is so broad the same clause (verbatim) in section 10 of the LGPL means the GPL license is incompatible with the terms of the LGPL! It's all very touchy. Best regards, Peter ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: Replacement for GMP: Update
Florian Weimer wrote: This is the offending part: * 3. All advertising materials mentioning features or use of this software *must display the following acknowledgement: *This product includes cryptographic software written by * Eric Young ([EMAIL PROTECTED]) *The word 'cryptographic' can be left out if the rouines from the library *being used are not cryptographic related :-). I would find this a real problem. Not because I wouldn't gladly acknowledge contributions in the text of the license to be distributed with the binary, or in the Help-About dialog box in the program, but because I wouldn't want to have to list contributions in all advertising materials eg an advertisement for one's program in a magazine etc - it would just make the product look amateurish as well as giving undue importance to just one of the many people whose work it was making use of. Also, what if your company made a T-shirt with a character from a computer game you're developing using Haskell - such legal smallprint would totally spoil the effect... ;-) Therefore I'd recommend that licenses for code used by GHC runtime should be either BSD or public domain. If the FFI was used for bignum then (talking about Windows OS for the moment) the bignum implementation could just be supplied as a C DLL, perhaps even several different C DLL's for people to choose which one they wanted to distribute with their program based on speed vs licencing issues. Eg if GMP was in a DLL then it would be sufficient to just supply gmp.dll + the gmp LGPL as two files along with the app binary and licensing issues would disappear afaiu. Another advantage of this direction would be that any slowness in the FFI would have to be ironed out, leading to a faster FFI which would be good for other things too eg matrix libs, graphics libs etc. Finally, separating bignum out of GHC runtime would make GHC runtime leaner therefore (hopefully)easier to maintain. Of course this depends on whether or not an FFI implementation of bignum would be fast enough compared to directly coding it in the runtime. (If you choose the DLL method I'd recommend incorporating the GHC version number into the name of the DLL to avoid DLL conflicts eg ghc-6.4.2 would use gmp-6.4.2.dll etc) Anyway good luck with whatever way you choose, Regards, Brian. -- Logic empowers us and Love gives us purpose. Yet still phantoms restless for eras long past, congealed in the present in unthought forms, strive mightily unseen to destroy us. http://www.metamilk.com ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: Re[2]: Replacement for GMP: Update
Simon PJ and Bulat, [the] ForeignPtr solution [has] gotten a lot cheaper in GHC 6.6 than it used to be, so it's worth trying. A merit of the approach is that is avoids fiddling with the bignum allocator at all. I actually did not know that until today; I have tried to keep up with the rapid changes going on but until Simon Marlow posted the FFI syntax patch on cvs-ghc-request I had not read into it that much. It won't be too much trouble for me to do a bare FFI binding to GMP or another library (people seem to be having problems with OpenSSL's license) but what I have been doing still applies: writing bitwise operators and cleaning things up. I don't know how much the indirection of a FFI binding would degrade the speed compared to a direct C-- binding (you have an extra function call with FFI); it should not be any more costly than indirection through two pointers. This will be quick: I will set up a GMP-FFI binding as a speed- reference, for starters. Best regards, Peter ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: Replacement for GMP: Update
Brian, Therefore I'd recommend that licenses for code used by GHC runtime should be either BSD or public domain. I agree. I was working on a rewrite of OpenSSL's BN from scratch-- maybe a rewrite of GMP would be better--but that is a huge undertaking for no other reason than these are big, complex projects. I will have to test Crypto++ and Botan to see if they are comparable in a Haskell context (both are written in C++; Crypto++ is essentially public domain while Botan has a BSD2 license--reproduce copyright in distributions of binary and source code). I will have to write least-common-multiple, bitwise operators and conversions to and from floating point representations. If the FFI was used for bignum then (talking about Windows OS for the moment) the bignum implementation could just be supplied as a C DLL, perhaps even several different C DLL's for people to choose which one they wanted to distribute with their program based on speed vs licencing issues. Eg if GMP was in a DLL then it would be sufficient to just supply gmp.dll + the gmp LGPL as two files along with the app binary and licensing issues would disappear afaiu. Another advantage of this direction would be that any slowness in the FFI would have to be ironed out, leading to a faster FFI which would be good for other things too eg matrix libs, graphics libs etc. Finally, separating bignum out of GHC runtime would make GHC runtime leaner therefore (hopefully)easier to maintain. I am testing two versions of GMP against the current internal version: one using FFI and another with the ForeignPtrs written in C--. If either is comparable to the internal version that is definitely a preferable solution for flexibility. I have to be very picky, though: Simon Marlow, Simon Peyton-Jones and the rest of the GHC Team are primarily interested in performance and the integrity of the RTS (no one would be happy if the RTS broke for bad FFI calls). Thanks for the encouragement. Best regards, Peter Tanski ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: Re[2]: Replacement for GMP: Update
John, After all on the average call where an object of that size is free already it is a single array lookup, we have: (a) fetch pointer (one read) (b) fetch next (one read) (c) store next as current (one write) This is true for memory access; it is not true for memory allocation. I do not know how malloc allocates memory on Windows but on general POSIX systems the kernel uses a linked list and lots of other management things to reduce fragmentation, such KMEMSTAT. Malloc may also block, which is something that you have more control over in your own garbage collected heap. A really good explanation for the problem of rapidly allocating and deallocating temporary blocks of memory under 35kb is here: http://ridiculousfish.com/blog/ archives/2006/05/16/36/ . In any case, Simon Marlow had previously mentioned that alloc (from GHC's heap) is faster than malloc. He is almost certainly correct, although I hope the difference will not be that great and the only thing I have to worry about is ForeignPtr. We shall see whether malloc-memory makes a difference in the benchmarks. A purely functional system -- one which does NOT convert self tail calls into jumps and reuse storage -- can perhaps be faster, since each thread can have its own local arena to allocate from (without need of any write barrier) .. however it isn't clear to me what the trade off is between cache misses and parallelism. That is interesting but I do not understand whether your mention of self tail calls turned into jumps was low or high level. From the context it seems as if you are talking about a high level implementation; each function running in a separate thread. GHC's RTS does use many separate threads (the RTS is threaded by default for the latest version, 6.6). As for turning self tail calls into jumps at the low level, GHC does do this through C-- (the GHC implementation of C-- is called Cmm). I believe that is both faster and more memory efficient than a high level threaded system. Philosophically speaking, even if Simon Peyton-Jones developed Cmm to solve the problem of efficient functional computations, Cmm has turned the research of Haskell from research on a computer language to research on a system of computation. (Maybe that is what he meant when some time ago he wrote John Meacham and said that they (the GHC researchers) considered compiling via C a dead end.) A language can be anything: all it requires is a means of being translated into machine code; pseudo-intelligent and advanced compiler systems such as gcc, JHC for Haskell or OCaml for the Caml version of ML, may translate programming languages into machine code but the underlying computation remains largely sequential. The curious thing about GHC- Haskell is that through the prism of Cmm, which enforces such things as immutable variables and recursion right at the machine level, Haskell is less a language of translation to sequential machine code and more a description of a computational model. If you still think I am wrong about this, consider the possibility that Haskell with Cmm is a modern research project in the same concept that motivated Lisp: a different model of computation. Best regards, Peter ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: Re[2]: Replacement for GMP: Update
On Fri, 2006-08-11 at 18:56 -0400, Peter Tanski wrote: John, After all on the average call where an object of that size is free already it is a single array lookup, we have: (a) fetch pointer (one read) (b) fetch next (one read) (c) store next as current (one write) This is true for memory access; it is not true for memory allocation. That's an *allocation* obtained by popping a suitable block off a linked list (sorry should have explained better) I do not know how malloc allocates memory on Windows but on general POSIX systems the kernel uses a linked list and lots of other management things to reduce fragmentation, such KMEMSTAT. Yes, but most allocations don't require a kernel call: malloc is implemented in libc and suballocates an block obtained from the kernel. Free'd blocks go on a free list indexed by block size .. so the typical program will get to a state where most allocations reduce to popping a list. Malloc may also block, which is something that you have more control over in your own garbage collected heap. Yes. A really good explanation for the problem of rapidly allocating and deallocating temporary blocks of memory under 35kb is here: http://ridiculousfish.com/blog/ archives/2006/05/16/36/ . Eh? That explains why OSX is slower with 35K blocks because it switches to kernel calls for larger allocations at that size .. the article doesn't say anything about smaller blocks. I assume we're considering small blocks here: list nodes and so on are small, and these would be the majority case in a functional programming language. A purely functional system -- one which does NOT convert self tail calls into jumps and reuse storage -- can perhaps be faster, since each thread can have its own local arena to allocate from (without need of any write barrier) .. however it isn't clear to me what the trade off is between cache misses and parallelism. That is interesting but I do not understand whether your mention of self tail calls turned into jumps was low or high level. What I mean is the optimisation (which Felix does for functional code) whereby you convert: int f(int x) { int y = x + x; if(x100) return x; return f(y); } to int f(int x) { start: int y = x + x; if (x100) return x; x = y; // assign argument goto start; } This optimisation eliminates the need to spawn a new stack frame, and will eliminate cache misses. So it looks very efficient. My point here is that actually this is a disastrous optimisation in a multi-processing environment, because in general, the assignment of a pointer means the store isn't write once. With write once store -- ignoring registers -- you can collect lazily in a background thread. Although you're not scanning all of memory .. new memory is allocated as time goes on .. it doesn't matter. Any garbage you find is certain to be garbage. This is because you cannot miss a heap block in your sweep. Even though new blocks are being allocated in parallel, they can only obtain a pointer from the old picture of the heap (implying the block is reachable), and, the pointer in the old block from which it was obtained cannot be erased: since constructed blocks are read only. I haven't figured all the details. I am imagining a generational copying collector with a separate young heap for every thread, whilst the great grand mother heap is collected in the background. This means (a) mother collection, (b) young heap compaction both work on a per thread basis independently of other threads and without any synchronisation or locking. What requires synchronisation is aging. Moving objects out of the child heaps into a parent needs to be serialised, and the start of a new sweep of the parent heap must twiddle a lock or something to ensure it has a coherent snapshot of the heap at some point of this serialisation (however the picture of the heap only needs to be synchronised once). for example if you have a linked list of the mother heap blocks, you have to serialise pushing stuff onto it during ageing .. and you have to get a valid pointer to start the sweep from. The point is, this scheme reduces the need for serialisation to the infrequent need to age objects or start a new sweep of the mother heap: the actual sweep, raw allocations, and even young heap compaction, all occur in separate threads and don't interfere with each other. Thus, it seems a lot more scalable to a multi-processing environment to actually have LOW LEVEL purely functional paradigm: no store can be written more than once. More precisely, pointers are initialised to NULL, and be assigned a non-NULL heap pointer once, but non-NULL pointer can never be erased. Which means: converting tail calls to a loop and using mutable store is a nono: it is faster on a single CPU but will
RE: Replacement for GMP: Update
Sounds good! Remember that the memory-allocation mechanism is crucial. How does BN do that? | (3) I have been looking at how to implement a dual-constructor-in-a- | pointer for Integer (i.e., merge constructors of small Integers and | big Integers into the Int#). Would that solution be workable or | might it break current Haskell programs? Just a thought. Making a single word contain either a pointer or a non-pointer (depending on the setting of some bit) would have some quite global implications, as would losing 32 bit precision from Int#. I suggest that you do not couple these two projects together! Do the GMP thing first, then investigate this later. We have quite a few bit-twidding ideas that we have never followed up. Simon | -Original Message- | From: Peter Tanski [mailto:[EMAIL PROTECTED] | Sent: 10 August 2006 06:32 | To: Simon Peyton-Jones; Simon Marlow; Esa Ilari Vuokko; John Meacham | Cc: glasgow-haskell-users@haskell.org | Subject: Replacement for GMP: Update | | Simon PJ, Simon, Esa and John, | | Here is an update on what I have been doing so far in making a grand | attempt to replace GMP. | | (1) evaluate replacement libraries | LibTomMath: | Pros- | * has all the operators GMP offered | Cons- | * slow; about half as fast as OpenSSL in my tests for simple | mathematical operations, much slower if you account for time to | write or resize memory. (There is another MP library, which | LibTomMath is based on, that is also very slow--student work) | * not even reentrant; needs significant modification | * beta-release; needs a bit of work to get it to production level | * configuration needs to be written (not really portable, messy) | ARPREC: | Pros- | * very fast (essentially does everything through the | Floating Point Unit of a CPU) | * complex mathematical operations | * very precise | * already thread safe (through C++ thread-safe statics) | Cons- | * no bitwise operations (not even left and right-shifts) | * difficult configuration (everything runs by setting a precision | level; | (precision level ~= number of words (doubles) in array) | it does not automatically resize memory; conversion from | MP Real to Integer relies specially on careful precision-level) | * memory inefficient (underestimates the number of real digits you | can fit into a double, i.e., a 64-bit double has 48 bits of | precision, | holding about 9.6 digits per byte, resulting in an 848-byte array | to hold an MP number with 1000 digits). | OpenSSL: | Crypto++ (http://www.eskimo.com/~weidai/cryptlib.html): | Botan (http://botan.randombit.net/): | Pros- | * all of these are fast, since all use Integers to support | cryptography; | (Crypto++ and Botan are C++ crypto-libraries, all licenses good) | * all of these provide most basic mathematical operations | Cons- | * none of these provide , |, ^(xor) bitwise operators | * Botan has least mathematical operators of the three | * none provide lcm operator | * all would realistically have to have the Integer libraries stripped | out of the distribution and repackaged for GHC | | Summary: I finally settled on modifying OpenSSL, since that would be | the easiest to work with under GHC's hood (plain C code, not C++). I | have to: | a. make OpenSSL's BN thread-safe (add thread local storage, at least) | b. optimally add configure-conditional parallelism to BN (through PVM) | c. add , |, ^, lcm and a few other operations | d. separate the BN from the rest of OpenSSL and rename the symbols | to avoid conflicts (necessary because I have to modify the library | anyway) | | (2) work on GHC: | * finally understand C--; know what I need to modify | * work through Makefiles: touch and go; I haven't mapped out all the | variable settings from configure.in on down when it comes to DLLs | Comment: | for the Makefile in ghc/rts, in lines 300-346, | GC_HC_OPTS += -optc-O3 | --isn't this problematic? gcc, from -O2 on includes -fgcse which may | *reduce* runtime performance in programs using computed gotos; | -fgcse is actually run twice, because -frerun-cse-after-loop is also | set at -O2. Would it be better to pass individual flags, such as | -funroll-loops and -falign-loops=16 (ppc, Intel setting)? | | (3) I have been looking at how to implement
RE: Replacement for GMP: Update
On 10 August 2006 06:32, Peter Tanski wrote: for the Makefile in ghc/rts, in lines 300-346, GC_HC_OPTS += -optc-O3 --isn't this problematic? gcc, from -O2 on includes -fgcse which may *reduce* runtime performance in programs using computed gotos; -fgcse is actually run twice, because -frerun-cse-after-loop is also set at -O2. Would it be better to pass individual flags, such as -funroll-loops and -falign-loops=16 (ppc, Intel setting)? Possibly, yes. IIRC, -O3 was mainly to get some loop unrolling. This is a performance-critical part of the system though, and when making any changes we like to measure things to make sure we haven't pessimised performance. (3) I have been looking at how to implement a dual-constructor-in-a- pointer for Integer (i.e., merge constructors of small Integers and big Integers into the Int#). Would that solution be workable or might it break current Haskell programs? Just a thought. Which representation in particular are you talking about here? Cheers, Simon ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: Replacement for GMP: Update
Einar, In my previous email I wrote something potentially confusing (really a typo): For developers (commercial or open source), the OpenSSL license only mentions redistribution of the OpenSSL code in binary form (paragraph 2). In this context binary form means the complete program binary, not partial binary as with statically linking to a library, so developers of GHC programs would *not* have to include the whole OpenSSLSSLeay license in their source code. I meant in their code, not source code, source or binary. I hope that helps. Best regards, Peter ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: Replacement for GMP: Update
On 10.08 11:16, Peter Tanski wrote: Paragraph 6 of the OpenSSL (1998-2005) license states that: * 6. Redistributions of any form whatsoever must retain the following *acknowledgment: *This product includes software developed by the OpenSSL Project *for use in the OpenSSL Toolkit (http://www.openssl.org/). All developers would have to do is include the acknowledgment stated above. I think this is not bad for specific applications, but forcing this upon all code compiled by GHC would be bad. I think the compiler should not link applications by default to things that force license related things. I think this is one reason GMP is being replaced. ps. personally I don't think the advertising clause is bad, but I think it is bad to force it on other users. - Einar Karttunen ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: Replacement for GMP: Update
Einar, *This product includes software developed by the OpenSSL Project *for use in the OpenSSL Toolkit (http://www.openssl.org/). All developers would have to do is include the acknowledgment stated above. I think this is not bad for specific applications, but forcing this upon all code compiled by GHC would be bad. I think the compiler should not link applications by default to things that force license related things. I think this is one reason GMP is being replaced. ps. personally I don't think the advertising clause is bad, but I think it is bad to force it on other users. You may be right. The licensing problem with GHC, as I understood it, is summed up at http://hackage.haskell.org/trac/ghc/wiki/ ReplacingGMPNotes. LGPL is very restrictive. As I have been working on separating BN out of the main OpenSSL distribution, renaming symbols and generally reforming it into a custom, stand-alone library for GHC I could take it one step further and implement it from scratch as a GHC library. Implementing the BN library from scratch may take some time but I will give it a shot and see if I can't get better benchmarks. The downside is that I would have more incentive to remove some Cryptography-based cruft, such as BN_nnmod, BN_mod_add, BN_mod_sub and the BN-random routines, as these are unnecessary for Prelude and GHC. Best regards, Peter ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: Replacement for GMP: Update
There's one thing I don't entirely understand about the GMP problem. I understand that there are some limitations on GHC's ability to generate relocatable (and therefore dynamically linkable) code on x86 (a register allocation problem related to the mangler if I recall the comments in the code correctly). But this shouldn't prohibit linking GMP in dynamically, should it? It's just a C library and GCC should happily generate relocatable code. As a dynamically linked library, there should be no tainting issues to worry about even if the dynamically linked code is shipped with the executable.Am I missing something?On Aug 10, 2006, at 5:00 PM, Peter Tanski wrote:Thorkil, I would like to mention a few things that I have not seen discussed: Clearly,using an existing library unmodified would be preferable: New developments,error corrections, documentation, wide exposure, all of these things would beavailable. Agreed. Unfortunately for us it seems that, apart from GMP, none of the fully developed and fast libraries available have all the functionality Haskell requires: ARPREC (and its predecessor, MPFUN) lack all binary operators (AND, OR, XOR, left-shift, right-shift, bit-flip, word-flip); OpenSSL's BN library lacks AND, OR, XOR and the least common multiple operation (lcm), as do the libraries in Botan and Crypto++. I did not mention this before, but they also lack conversion functions to and from floating point representations. So it would not be possible to use these in GHC and still maintain the same functionality without writing those transformation functions either separately (slow) or in the libraries themselves (modification).There are libraries that have the functionality we require: MIRACL (at http://indigo.ie/~mscott/) is free for educational use but requires a premium for commercial use; LibTomMath is essentially beta (version 0.38), slow (less than half as fast as OpenSSL, which is equivalent in speed to GMP) and from my experience when putting it together, a little hacked (after all, it is a beta version); MPI (from http://www.cs.dartmouth.edu/~sting/mpi/) is a good library and has all the math functions but lacks the binary operations and has not been updated since 2003; maybe it has not required updating (it was originally written in 1998). I have read that MPI is not as fast as LibTomMath (I will do a real comparison if anyone is interested). I do know from timing tests I did run in the MPI distribution that MPI tends to operate better on small integers (128-256 bits). LibTomMath and MPI are both essentially public domain. I have heard that LibTomMath is used as the Tcl bignum library; I do not know that library supports Perl's bignum (Math::BigInt) but I could look into it.Bignum libraries are actually fairly widely available but few are portable across a wide variety of domains: Microsoft's .NET bignum library and Apple's vecLib BigNum library, for example. (Apple's vecLib vBigNum is also limited since it offers big integers only at 256, 512 and 1024 bits and it does not provide all the mathematical operations we require.)You may have noticed from reading the discussions so far that an internal library *is* a possibility. I have looked briefly at the OpenSSL Bignum library and in the areas of memorymanagement, but also error handling, it seems clearly intertwined to someextent with OpenSSL in ways which would appear to rule out the direct use ofthis library for GHC Integers. OpenSSL's BN library is primarily tuned to support cryptography, particularly the generation of very large primes for public key cryptosystems. It is possible to separate the BN library out (I am having some success there already). It is also possible to use the library separately from Haskell using ForeignPtr; essentially doing everything through Haskell's FFI. I have honestly not benchmarked a FFI-ForeignPtr interface against the current internal-GMP implementation, partly because the overhead required to use ForeignPtr and the availability of garbage-collected memory for GMP indicate that an internal GHC Bignum library would clearly be faster. Many people have given good arguments for using an FFI-interface to such a library (for one, GMP supports dll's and posix dynamic libraries, which would eliminate the licensing issue), so I think before I go on I will set up a benchmark to try the two out. The one thing I cannot do without taking GMP out of the GHC compiler is a side-to-side comparison of GMP-FFI and GMP-internal because GMP can use only one memory allocator at a time: either GHC's runtime system Garbage Collector or malloc. However, considering the advantages of usingan existing library unchanged, we might consider another possibility: Workingwith the OpenSSL people to modify their library to allow the sort ofinterfacing that is needed for its direct and efficient use in GHC. While, ofcourse, retaining its value as part of OpenSSL. I haven't looked at that for reasons of time--perhaps this is my
Re: Replacement for GMP: Update
Reilly Hayes on 2006-08-10 18:36:49 -0700: There's one thing I don't entirely understand about the GMP problem. I understand that there are some limitations on GHC's ability to generate relocatable (and therefore dynamically linkable) code on x86 (a register allocation problem related to the mangler if I recall the comments in the code correctly). But this shouldn't prohibit linking GMP in dynamically, should it? It's just a C library and GCC should happily generate relocatable code. As a dynamically linked library, there should be no tainting issues to worry about even if the dynamically linked code is shipped with the executable. Am I missing something? No, the LGPL doesn't mandate source redistribution when you redistribute a binary that is dynamically linked to a LGPL-licensed library. If GHC did support dynamically linking programs, it wouldn't be an issue, but GHC only supports that on OS X. I was wondering something similar - is it really easier to replace the functionality and speed of GMP than to extend GHC's dynamic library support to other platforms? signature.asc Description: Digital signature ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: Replacement for GMP: Update
Reilly, ... this shouldn't prohibit linking GMP in dynamically, should it? It's just a C library and GCC should happily generate relocatable code. As a dynamically linked library, there should be no tainting issues to worry about even if the dynamically linked code is shipped with the executable. Am I missing something? Not at all. GHC builds the GMP library only if it is not already available on the system. On my Mac OS X computer it uses a dynamic library. I have not tried using gmp.dll on Windows since I have not built GHC on my Windows computer (it is a bit slow--a 600MHz P3). But the dynamic library form of GMP only solves the licensing problem (admittedly, for my purposes the worst of the bunch). It should be easy to change GMP's build settings so GHC is distributed with a dynamic GMP library. The other problem is that GMP has a mechanism to let the user determine its memory allocator, with the caveat that only one allocator can be used by a single program. GHC configures GMP to use GHC's RTS-GC for allocation so GHC-compiled programs can't interface with GHC separately. (This would not be such a big problem for general programs but C-Haskell cryptographic or scientific programs that might benefit from GMP's additional functionality would suffer.) On a side note, if you have been reading this user-list recently it seems that programmers (including myself, I guess) do not want to have to package a dynamic library (GMP) with programs compiled with GHC--a particularly irksome task if your Haskell program doesn't even *use* Integer. Not only do users have to package the separate dll, they also have to package a notice of the GMP copyright along with the binary. Just today Einar Karttunen mentioned that: *This product includes software developed by the OpenSSL Project *for use in the OpenSSL Toolkit (http://www.openssl.org/). All developers would have to do is include the acknowledgment stated above. ... ps. personally I don't think the advertising clause is bad, but I think it is bad to force it on other users. Einar does have a good point, here. Personally speaking, such packaging and licensing stuff is o.k. for free software but for a clean commercial distribution it would be a bad thing; a reason to choose not to use GHC (or nhc98, for that matter). Best regards, Peter ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Replacement for GMP: Update
Simon PJ, Simon, Esa and John, Here is an update on what I have been doing so far in making a grand attempt to replace GMP. (1) evaluate replacement libraries LibTomMath: Pros- * has all the operators GMP offered Cons- * slow; about half as fast as OpenSSL in my tests for simple mathematical operations, much slower if you account for time to write or resize memory. (There is another MP library, which LibTomMath is based on, that is also very slow--student work) * not even reentrant; needs significant modification * beta-release; needs a bit of work to get it to production level * configuration needs to be written (not really portable, messy) ARPREC: Pros- * very fast (essentially does everything through the Floating Point Unit of a CPU) * complex mathematical operations * very precise * already thread safe (through C++ thread-safe statics) Cons- * no bitwise operations (not even left and right-shifts) * difficult configuration (everything runs by setting a precision level; (precision level ~= number of words (doubles) in array) it does not automatically resize memory; conversion from MP Real to Integer relies specially on careful precision-level) * memory inefficient (underestimates the number of real digits you can fit into a double, i.e., a 64-bit double has 48 bits of precision, holding about 9.6 digits per byte, resulting in an 848-byte array to hold an MP number with 1000 digits). OpenSSL: Crypto++ (http://www.eskimo.com/~weidai/cryptlib.html): Botan (http://botan.randombit.net/): Pros- * all of these are fast, since all use Integers to support cryptography; (Crypto++ and Botan are C++ crypto-libraries, all licenses good) * all of these provide most basic mathematical operations Cons- * none of these provide , |, ^(xor) bitwise operators * Botan has least mathematical operators of the three * none provide lcm operator * all would realistically have to have the Integer libraries stripped out of the distribution and repackaged for GHC Summary: I finally settled on modifying OpenSSL, since that would be the easiest to work with under GHC's hood (plain C code, not C++). I have to: a. make OpenSSL's BN thread-safe (add thread local storage, at least) b. optimally add configure-conditional parallelism to BN (through PVM) c. add , |, ^, lcm and a few other operations d. separate the BN from the rest of OpenSSL and rename the symbols to avoid conflicts (necessary because I have to modify the library anyway) (2) work on GHC: * finally understand C--; know what I need to modify * work through Makefiles: touch and go; I haven't mapped out all the variable settings from configure.in on down when it comes to DLLs Comment: for the Makefile in ghc/rts, in lines 300-346, GC_HC_OPTS += -optc-O3 --isn't this problematic? gcc, from -O2 on includes -fgcse which may *reduce* runtime performance in programs using computed gotos; -fgcse is actually run twice, because -frerun-cse-after-loop is also set at -O2. Would it be better to pass individual flags, such as -funroll-loops and -falign-loops=16 (ppc, Intel setting)? (3) I have been looking at how to implement a dual-constructor-in-a- pointer for Integer (i.e., merge constructors of small Integers and big Integers into the Int#). Would that solution be workable or might it break current Haskell programs? Just a thought. -Peter ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re[2]: Replacement for GMP
Hello Simon, Wednesday, August 2, 2006, 4:05:51 PM, you wrote: (2) We're concerned about performance. Replacing GMP, but losing substantial performance on bignum-intensive programs would be unattractive. don't forget about speed/memory efficiency of any programs that use Integer just for case but really most of their numbers fit in 32/64 bits. i have one particular program of this type - it builds list of all files on disk and Integers are used to save filesizes. i will be glad if, vice versa, memory requirements for small integers will be reduced to the same as for Ints the same binary that also wants to use GMP. (Of course, we could *copy* GMP, changing all the function names. That would eliminate the problem!) isn't it rather easy task for some automated tool? i think that even existing tools may be found -- Best regards, Bulatmailto:[EMAIL PROTECTED] ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: Re[2]: Replacement for GMP
Hi Bulat, don't forget about speed/memory efficiency of any programs that use Integer just for case but really most of their numbers fit in 32/64 bits. i have one particular program of this type - it builds list of all files on disk and Integers are used to save filesizes. i will be glad if, vice versa, memory requirements for small integers will be reduced to the same as for Ints I was looking at this yesterday, partly because I read your previous discussion in returning to cost of Integer. The low-level solution Lennart mentioned and that you noted is used in OCaml would be fast and convenient for a programmer. That solution would be much more difficult in C--, however, since it requires customisations for different processors or operating systems, especially 32 and 64bit architectures. Following the general trend of consensus, it should be part of the Bignum library; if the library does not have such a test it, as OpenSSL's BN library does not, it would have to be added. (With unmodified OpenSSL, you would have to examine the When the Bignum library returned the value to the RTS, the RTS would only have to check for the tag and store it accordingly. the same binary that also wants to use GMP. (Of course, we could *copy* GMP, changing all the function names. That would eliminate the problem!) isn't it rather easy task for some automated tool? i think that even existing tools may be found I know copyrights are weak compared to patents but I do not think software copyrights are that weak. Just changing the names seems like a cosmetic change and performing the change through an automated system, doubly so. Copyrighted programs--particularly under the GPL license, which also covers the resulting object code--do not lose their copyright protection through name-mangling performed by a preprocessor. I think the lawyers for a company using GHC would probably be worried about it. Best Regards, Peter ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users