Re: FFI Bindings to Libraries using GMP
Hello Benedikt,

I apologise for the late reply. I am travelling tomorrow but I will try to get an alpha implementation out by this Wednesday. For now, here are some preliminary answers.

On Sep 28, 2007, at 7:41 AM, Benedikt Huber wrote:

Am 18.09.2007 um 05:49 schrieb Peter Tanski:

The best solution would be to revamp the way Integer types are implemented, so when possible they are mutable under the hood, much like using the binary += instead of the ternary +. Enumerations like the test in [1], below, would not be mutable unless there were some information, such as a good-consumer function, that indicated the intermediate values were only temporarily necessary.

I'm not sure if I understand this correctly; do you want to expose an unsafe/IO interface for destructive Integer manipulation?

I would not expose it, just optimise it, in the same way we can replace recursion with loops at the Cmm level. The end result would involve recycling integer memory, so you might say that in some equations integers are mutable. (If it is provable that an integer value will not be used again, then it does not seem right not to recycle the memory.)

The OpenSSL library is not GPL compatible, so there would be licensing problems for GPL'd system distributions; it is also relatively slow, though it does have a noticeably constant curve for exponential functions.

Maybe you should add a note to http://hackage.haskell.org/trac/ghc/wiki/ReplacingGMPNotes/PerformanceMeasurements. The statistics suggest that the OpenSSL BN has comparable performance to GMP, especially for smaller numbers. Some note about the (very confusing) licensing issues regarding OpenSSL would also be nice.

I will add this to the wiki. In short, paragraph 10 of the GPL and paragraph 11 of the LGPL--here I may have the paragraphs wrong--prohibit placing any additional restrictions on your licensees.
OpenSSL places an additional restriction on licensees: you cannot use the name 'OpenSSL' with regard to your product, so the OpenSSL license is incompatible with the GPL/LGPL.

[1] Simple Performance Test on (ghc-darwin-i386-6.6.1):

Malloc is fast but not nearly as fast as the RTS alloc functions. One thing I have not started is integrating the replacement library with GHC, mostly because the replacement library (on par with or faster than GMP) uses SIMD functions whenever possible and they require proper alignment.

Ok, it's good to know you're already working on integrating a (native) replacement library.

It's workable for now but I need to finish Toom3, a basic FFT, and some specialised division operations. I also need to give Thorkil Naur a crack at it. All of this has been on hold because I have been too selfish and perfectionistic to give anyone what I consider a mess, and I have been working too many hours to fix it. (This seems to be a common problem of mine; I intend to change that.)

I also performed the test with the datatype suggested by John Meacham (using a gmp library with renamed symbols),

    data FInteger = FInteger Int# !(ForeignPtr Mpz)

but it was around 8x slower, maybe due to the ForeignPtr and FFI overhead, or due to missing optimizations in the code.

That is quite an interesting result. Are these safe foreign imports?

No. Note that `FInteger' above is even faster than the built-in Integer type for small integers (Ints), so I was talking about allocation of gmp integers. I elaborated the test a little; it now shows consistent results, I think [1a]. A lot of performance is lost when doing many allocations using malloc, and even more when invoking ForeignPtr finalizers.

I found the same thing when I tried that; malloc is slow compared to GC-based alloc. The ForeignPtr finalizers do not always run since--as far as I know--they are only guaranteed to run before RTS shutdown.
I'm still interested in sensible solutions to Bug #311, and maybe nevertheless simply switching to standard gmp allocation (either with finalizers or copying limbs when entering/leaving the gmp) via a compile flag would be the right thing for many applications. I'm also looking forward to seeing the results of the replacement library you're trying to integrate, and those of Haskell Integer implementations.

The fastest interim solution I can come up with for you would be to use Isaac Dupree's Haskell-based integer library and set up preprocessor defines so you could build ghc (HEAD) from source and use that. Would that be sufficient for now?

Cheers, Pete

[1a] Integer Allocation Test

    allocTest :: Int -> T    -- T = some integral type
    allocTest iterations = iterateT iterations INIT
      where
        iterateT 0 v = v
        iterateT k v = v `seq` iterateT (k-1) (v + STEP)

    -- Small BigNums Allocation Test (INIT = 2^31, STEP = 10^5, k = 10^6)

Results (utime, samples.sort[3..7].average) on darwin-i386 (dual-core):

    0.04s  destructive-update C implementation
Re: FFI Bindings to Libraries using GMP
On Sep 14, 2007, at 9:14 AM, Benedikt Huber wrote:

| I've been struggling using FFI bindings to libraries which rely on the
| GNU Mp Bignum library (gmp).

It's an issue that bites very few users, but it bites them hard. It's also tricky, but not impossible, to fix. The combination means that at GHC HQ we work on things that affect more people. I doubt we can spare effort to design and implement a fix in the near future -- we keep hoping someone else will step up and tackle it! Peter Tanski did exactly that (he's the author of the ReplacingGMPNotes above), but he's been very quiet recently. I don't know where he is up to. Perhaps someone else would like to join in?

I apologise for being away. The company I work for has been ramping up for a launch and I have been working very long hours (nights and weekends, too).

Thank you for the information - I'm also willing to help, though I'm not too familiar with the GHC internals (yet). I do like the idea of optionally linking with a pure-Haskell library, but I'm interested in a solution with comparable performance. Commenting on solutions to ticket #311:

It goes beyond mere familiarity with the internals: the GMP functions are threaded throughout the RTS and the PrimOps files. Of all the primitive operations, they are the most ubiquitous for interfering in other code. The rough list I put on the ReplacingGMP page is a start, but the more I work with the RTS the more little things keep turning up. At the least I should update the page.

(2) Using the standard allocation functions for the gmp memory management (maybe as a compile flag), as suggested in http://www.haskell.org/pipermail/glasgow-haskell-users/2006-July/010660.html, would also resolve ticket #311. In this case at least the dynamic part of gmp integers has to be resized using external allocation functions, and a finalizer (mpz_clear) has to be called when an Integer is garbage collected.
It seems that the performance loss from using malloc is significant [1], as lots of allocations and reallocations of very small chunks occur in a functional setting; some kind of (non-garbage-collected!) memory pool allocation would certainly help. I'm not sure what overhead is associated with calling a finalizer?

The problem of lots of small allocations affects the garbage collector as well. In the current implementation, each GMP operation calls doYouWantToGC()--I'm sure you have seen the note in PrimOps.cmm, for example--which may act as a stop-the-world garbage collection. The byte arrays for GMP are also pinned. Compared to this, an FFI implementation using finalizers, which have horrible but practical guarantees on when they are called, would work much better.

The best solution would be to revamp the way Integer types are implemented, so when possible they are mutable under the hood, much like using the binary += instead of the ternary +. Enumerations like the test in [1], below, would not be mutable unless there were some information, such as a good-consumer function, that indicated the intermediate values were only temporarily necessary.

(3) So when replacing GMP with the BN library of OpenSSL (see http://hackage.haskell.org/trac/ghc/wiki/ReplacingGMPNotes/PerformanceMeasurements), it would probably be necessary to refactor the library so custom allocation can be used as well. This does not seem too difficult at first glance, though.

The OpenSSL library is not GPL compatible, so there would be licensing problems for GPL'd system distributions; it is also relatively slow, though it does have a noticeably constant curve for exponential functions. The one problem you will find with _all_ potential replacement libraries is incompatible behaviour for bitwise functions: they are implemented arithmetically in GMP but logically elsewhere (when they are implemented at all).
(Note: if you are looking for the left-shift and right-shift operations in GMP, they are hidden in mpz_mul_2exp and mpz_tdiv_q_2exp.) LibTomMath, for example, uses pure logical shifts, which do not produce correct results. I could go on about many other small differences, but the end result is that you would have to do a lot of hacking to get a library that would replace all the functionality GMP provides. That is why I started a replacement from scratch.

So I'd like to investigate the second or third option, as far as my knowledge and time permit. Of course it would be wise to check first if Peter Tanski is already/still working on a GMP replacement.

I left off working on it for some time, but things may slow down a little for now, so I will (hopefully) have time to package it up. I meant to do that more than a month ago for Thorkil, who has written a multi-precision integer library before and wanted to help.

[1] Simple Performance Test on (ghc-darwin-i386-6.6.1): The haskell function (k
Re: 64-bit windows version?
On Jun 26, 2007, at 4:59 AM, Simon Marlow wrote:

Peter Tanski wrote:

I keep on referring to this as temporary because there are two different builds here: (1) the build using the old mingw-GHC, without option support for CL; and, (2) the build using the new Windows-native GHC.

Yes. And what I'm suggesting is the following - what I've been suggesting all along, but we keep getting sidetracked into sub-discussions:

- we adapt the current build system to use the native GHC. I really don't think this is hard, and it's way quicker than replacing significant chunks of the build system, as you seem to be suggesting.

I don't have to replace large chunks of the system, although I have added several separate makefiles--an mk/msvc_tools.mk and mk/build.mk.msvc (which configure will copy into mk/build.mk). It is almost done (the current system, I mean)--although I do have one detail question: Clemens Fruhwirth sent a patch to add shared library support for all architectures, i.e., MkDLL -> MkDSO (in compiler/main/DriverPipeline.hs). I haven't seen the patch after I did my last pull, yesterday, so I assume it has not been applied yet. How do you want autoconf to detect the shared library extension and libtool support? AC_PROG_LIBTOOL does not seem to work well on OS X: OS X libtool is Apple's, not GNU's (it is also a binary, not a driver-script for libltdl); that macro failed the last time I built GMP and I had to make shared libraries manually. This is pertinent because the default build for Windows should be DLLs, but I want the configuration (at least) to mesh with the rest of the system: I wanted to add $(libext) and $(shlibext); as it is, I vary them by a simple case in *windows), *darwin*) or *) (unix), but this does not seem correct.

So the result is a build system that can build a win-native GHC using another win-native GHC, but not necessarily build a win-native GHC using a mingw GHC.
I could set it up so configure could detect which GHC is available and build using that GHC (mingw or Windows-native). (Just add a C-compiler string to 'ghc -v' or 'ghc --version' and grep for it.)

I'm not against VS in particular - I'm against duplication. Build systems rot quickly. By all means discuss a wonderful replacement for the current build system -- I'm aware that the current system is far from perfect -- but I'm not at all convinced that it is a necessity for building win-native GHC.

VS is not necessary; it is aesthetic and may offer other benefits for those who wish to hack on GHC. It would require many bits of glue code and careful alignment of tasks, so the entire build would not be transparent to any but the most experienced VS programmers. It would, however, be much easier for the more casual developer, and it may not be as brittle: shallow build settings, compiler settings, source files included, a bureaucratic notion of ways, would all be available from a simple point and click. If I have time I will at least do a prototype (base-compiler only) and see if people like it. I could be wrong. If I am wrong, then constructing a convincing argument might be difficult...

We'll have to import new hackers who understand VS builds, because none of the current GHC maintainers do! New blood! :)

I'm joking--there have been forks of GHC in the past, but they generally don't last long because GHC moves too fast, and that's because the Architects are still at work. The only convincing argument here would be a prototype that even the GHC maintainers would be able to understand.
Certainly doable, but it does present a conundrum: for the old GHC (without builtin cl-support) the order for compilation seems to be:

    compile/link command, compile/link flags, output, source/object files, other flags

while for cl running link.exe, or link.exe alone, it is better to put all the files at the end of the command line:

    compile/link command, compile/link flags, output, other flags, source/object files

Why is that a conundrum? GHC can invoke CL with the arguments in whatever order it likes. Sorry, but this just seems like a trivial detail to me.

Mingw GHC can't do that. I simply added some conditional changes to the rules in mk/suffix.mk.

Cheers, Pete

___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: 64-bit windows version?
On Jun 25, 2007, at 5:19 AM, Simon Marlow wrote:

Yes it is easy, but now all Makefiles must be changed to use $(osuf), such as this line in rts/Makefile: 378: %.$(way_)o : %.cmm $(H_FILES), for what will be a (hopefully) temporary Windows build.

I bet there are only a few makefiles that explicitly refer to o as the object-file suffix.

After poking around I found that my fears were unfounded. Simply pass cl the /TC (or -TC) option--the same as the gcc option '-x c'. Object files are also fine, since cl assumes any file with an unrecognised suffix is an object file. The environment variables problem is also solved: either have the environment set up automatically by placing a batch-script 'call' to the MS PSDK 'SetEnv.Cmd' before the shell login in msys.bat, or start the DOS shell from the MS PSDK shortcut and log into the msys shell manually--or run the whole thing from DOS.

Shows how much I know of msys. Passing flags to cl would be best done in a command file (at least I have done _that_ before).

I don't understand why you see this as a temporary measure. Surely we'll need a way to build GHC again for this platform? Unless you intend to replace the whole build system? (which I strongly recommend *not* doing, at least not yet)

I keep on referring to this as temporary because there are two different builds here: (1) the build using the old mingw-GHC, without option support for CL; and, (2) the build using the new Windows-native GHC. You will almost certainly keep mingw-GHC around, but users should not have to download a mingw-GHC to build Windows-native from source (they can't start at a stage1 build), so the Windows-native build requires a separate setup. That might as well be Windows-native itself; in other words, use VS--it is the quickest and easiest build to put together.
I do not suggest CMake because CMake is a sledgehammer when it comes to managing projects and sub-projects: all paths are absolute (you cannot move the source directories around); there is only one major Project in a system--it only really builds 'all', not sub-targets--and build variants beyond the builtin Debug, MinSizeRel, Release, etc., have to be custom-added; it would not integrate well with the current $(way) system. If you are heavily against using VS, maybe an Eclipse/Ant-based build would do. I might use Bakefile.

It would be much better to have a single build system.

I would gladly replace the whole thing for three reasons: (1) it is a source of many build bugs and it makes them much more difficult to track down; (2) it seems to be a serious hurdle for anyone who wants to build and hack on GHC--this is true for most other compiler systems that use autoconf and Make; and, (3) if GHC is ever going to have cross-compilation abilities itself, the current build system must go, while cross-compiling GHC with the current system requires access to the actual host-system hardware.

The reasons I don't are: (1) time (parallel to money); (2) I wouldn't undertake such an effort unless we were all pretty sure what you want to change the build system to; (3) an inevitable side-effect of the move would be the loss of old (or little-used) build settings, such as GranSim, and a change to the build system would propagate to parallel projects; and, (4) it is a huge project: both the compiler and libraries must change, and the change must integrate with the Cabal system. Work on the mingw-make system is progressing fairly well.

The reasons to make a special VS build are: (1) Windows programmer familiarity; (2) a reduction in the number of build bugs; (3) ease of extension or integration with other VS tools, such as .NET; and, (4) speed--VS builds are much faster than Make.
I should also add that when building the RTS it is simply much easier to have a build problem reported in VS than to search back through Make-output and manually go to the offending line in a source file. The reason not to make a special VS build is that you would have to support it--one more thing to check when new source files are added. As I said before, this may be scripted, and if Windows programmers have something familiar to work with, there may be more of them to help. (You probably have better reasons than that one.)

Use GHC as your C compiler, i.e. don't invoke CL directly from make, and add the INCLUDE/LIB directories to the RTS's package.conf.

Certainly doable, but it does present a conundrum: for the old GHC (without builtin cl-support) the order for compilation seems to be:

    compile/link command, compile/link flags, output, source/object files, other flags

while for cl running link.exe, or link.exe alone, it is better to put all the files at the end of the command line:

    compile/link command, compile/link flags, output, other flags, source/object files

It also adds one more layer of indirection at that delicate stage. I am in the process of modifying and testing
Re: 64-bit windows version?
On Jun 25, 2007, at 12:06 PM, skaller wrote:

On Mon, 2007-06-25 at 11:43 -0400, Peter Tanski wrote:

It would be much better to have a single build system. I would gladly replace the whole thing ...

I am thinking of starting a new project (possibly on SourceForge) to implement a new build system. I think Erick Tryzelaar might also be interested. The rule would be: it isn't just for GHC. So any interested people would have to thrash out what to implement it in, and the overall requirements and design ideas. My basic idea is that it should be generic and package based, that is, it does NOT include special-purpose tools as might be required to build, say, Haskell programs: these are represented by 'plugin' components. A rough model of this: think Debian package manager, but for source code, not binaries.

I have been considering the same thing for some time, partly because the specification properties of most available build systems are terrible: XML is not a language (it is better as a back-end for a gui-based build system); current plug-in systems (similar to what WAF uses) are object-oriented and require a deep knowledge of the build system API; others are manual hacks. One big thing to avoid is cache-files: CMake, SCons, WAF, and the autotools all use cache-files, and all run into problems when the cache files aren't cleaned or include errors. (WAF has the best cache-file system--they are designed to be human-readable.) I will gladly lend support to this.

An idea I have been kicking around is a hybrid between the autoconf strategy and the commercial-setup strategy: add support for a specification of program requirements, i.e., stdint.h, gcc/cl/icl, things like that, in a simple spec-document with dead-simple syntax, then let the build system handle the rest--it would know what to do for each architecture. That seems similar to the Debian package maker, right? What language are you thinking about using?
Scheme seems good, but the build-language itself might be different; gui support should be available, which says to me (horrors!) Java AWT--a cross-platform gui-supported build system (not an IDE) would rock the world because it doesn't exist. There are tons of Python packages out there (A-A-P, which hasn't been updated since 2003; SCons; WAF; Bakefile (uses Python)). I don't know if this is possible in Felix.

Other requirements might be: (1) never alter anything in the source directories--everything is managed through the build directory; (2) the ability to easily build separate targets from the command line, similar to 'make test_shifts'--externally scriptable.

One thing other systems seem to fail at is building off the enormous trove of information in autoconf--it's right there, open source, and they reinvent the wheel when it comes to making configuration tests or finding platform information (config.guess is essentially a database of platform-specific information, but it is somewhat dated with regard to newer systems, including OS X). On that note, a different approach to configuration tests might be clearer knowledge about the compilers: all these systems build small C programs to test certain compiler characteristics and test for a 0 exit value. Well, CMake does actually 'read' the output files for some things, such as compiler version.

Cheers, Pete
Re: 64-bit windows version?
On Jun 25, 2007, at 12:55 PM, kyra wrote:

Certainly doable, but it does present a conundrum: for the old GHC (without builtin cl-support) the order for compilation seems to be: compile/link command, compile/link flags, output, source/object files, other flags; while for cl running link.exe, or link.exe alone, it is better to put all the files at the end of the command line: compile/link command, compile/link flags, output, other flags, source/object files. It also adds one more layer of indirection at that delicate stage.

Maybe some gcc-mimicking cl wrapper tailored specifically for the GHC building system could help? One more layer of indirection, but it could leave the ghc driver relatively intact.

That's a good idea! Do you know if or how the mingw-gcc is able to do that? Does mingw-gcc wrap link.exe? It sounds silly that someone relatively inexperienced with mingw should be doing this, but it _really_ needs doing and no one else seems to want it (besides, from my perspective, once I get through the build-system drudgery it lets me handle the fun stuff, like adding inline MASM to the RTS, such as ghc/includes/SMP.h).

Cheers, Pete
Re: 64-bit windows version?
On Jun 25, 2007, at 3:34 PM, skaller wrote:

On Mon, 2007-06-25 at 13:35 -0400, Peter Tanski wrote:

Maybe some gcc-mimicking cl wrapper tailored specifically for the GHC building system could help? One more layer of indirection, but it could leave the ghc driver relatively intact.

That's a good idea! Do you know if or how the mingw-gcc is able to do that? Does mingw-gcc wrap link.exe?

There's more to portable building than the build system. For example, for C code, you need a system of macros to support

    void MYLIB_EXTERN f();

where MYLIB_EXTERN can be empty, say, __declspec(dllexport) on Windows when building a DLL, and __declspec(dllimport) when using it. This is *mandatory*.

Of course--one thing I would add to a build system, instead of compiling little C files and testing the return value to detect some compiler functionality, is the ability to read builtin macros, say, by telling the compiler to dump all macros like 'gcc -E -dM' and then reading through the macros.

As for the Windows-native build, I am pretty far along with that, but the idea was to hijack the gcc executable with a script that would convert the gcc arguments to cl arguments. The one thing such a script would not do is compile everything at once. So far that is one thing I am adding to the Make system here: dependency generation is good for Haskell files but is not necessary for C files, since I can bunch the C sources together with the compiler flags and pass them to cl all at once in a command file. This should be faster than Make.

The build system controls the command line switches that turn on the We're building a DLL flag. A distinct macro is needed for every DLL.

That is part of the modifications to the runtime system (RTS).

In Felix, there is another switch which tells the source if the code is being built for static linkage or not: some macros change when you're linking symbols statically compared to using dlsym(). It's messy: the build system manages that too.
Sometimes this is better done in header files, changing the macros with defines the build system passes to the C compiler, but Felix's system is much more flexible than that (it builds the source files as interscript extracts them, right?).

Building Ocaml, you have a choice of native or bytecode, and there are some differences. Probably many such things for each and every language and variation of just about anything .. eg OSX supports two kinds of dynamic libraries.

GHC's interpreter (GHCi) does have to be built. I have not found a libReadline DLL, but I am sure I can scrounge something--possibly from Python, since they had this same problem back around 2000.

The point is that a 'Unix' oriented build script probably can't be adapted: Unix is different to Windows. The best way to adapt to Windows is to use Cygwin .. if you want a Windows native system, you have to build in the Windows way and make Windows choices. A silly example of that is that (at least in the past) Unix lets you link at link time against a shared library, whereas Windows requires you to link against a static thunk .. so building a shared library produces TWO outputs on Windows.

I am building with Mingw because that is better supported by the GHC build system (Cygwin is somewhat defunct); the end result should build from source in Visual Studio/Visual C++ Express.

Cheers, Pete
Re: 64-bit windows version?
On Jun 22, 2007, at 9:45 AM, Simon Marlow wrote:

skaller wrote:

On Fri, 2007-06-22 at 12:03 +0100, Simon Marlow wrote:

Ok, you clearly have looked at a lot more build systems than I have. So you think there's a shift from autoconf-style figure out the configuration by running tests to having a database of configuration settings for various platforms?

I shouldn't overstate the situation: the other complete build systems, CMake and SCons, do have autoconf capabilities in the way of finding headers and programs and checking test-compiles, the basic sanity checks--CMake has many more autoconf-like checks than SCons. Where they differ from the automake system seems to be their setup, which, like Make, has hard-coded settings for compilers, linkers, etc. (Some standard cmake settings are wrong for certain targets.) I don't know if you have any interest in pursuing or evaluating CMake (certainly not now), but the standard setup is stored in a standard directory on each platform, say, /usr/share/cmake-2.4/Modules/Platform/$(platform).cmake, and may be overridden by your own cmake file in, say, $(srcdir)/cmake/UserOverride.cmake.

The preset-target-configuration build model I was referring to is a scaled-down version of the commercial practice which allows you to have a single system and simultaneously compile for many different architecture-platform combinations--once you have tested each and know how everything works. For the initial exploration, it is a different (more anal) strategy: before invading, get all the intelligence you can and prepare thoroughly. The GNU-autoconf strategy is to keep a few troops who have already invaded many other places, adjust their all-purpose equipment a little for the mission, and let them have at it. My gripe is that their equipment isn't very good.

Cheers, Pete
Re: 64-bit windows version?
On Jun 22, 2007, at 7:03 AM, Simon Marlow wrote:

In fact, to build a source distribution on Windows, there are only 3 dependencies: GHC, Mingw and (either MSYS or Cygwin). To build from darcs, you also need: darcs, Happy, and Alex. To build docs, you also need Haddock. To run the testsuite you need Python.

True, Mingw does come standard with perl and a version of flex. There are Windows-native versions of Perl and flex available (i.e., ActivePerl). Now you are familiar with Mingw. Imagine being a standard Windows programmer, trying to choose which version of Mingw to download--some are minimal installations--and going over the build requirements: perl, flex, happy, alex, and haddock are listed. That is quite a bit of preparation. There are minimal-effort ways to go about this (I will look into updating the wiki).

Whatever the end result is, GHC must be able to operate without Mingw and the GNU toolset. That's the whole point of doing the port!

For running GHC--how about being able to build a new version of GHC from source?

1. modify GHC so that: a) it can invoke CL instead of gcc to compile C files

Mostly done (not completely tested).

b) its native code generator can be used to create native .obj files

I think you kept the syntax the same and used YASM; the other alternative is to generate Intel/MS syntax and use MASM. This is as easy as simply using Yasm--also mostly done (not completely tested). By the way, by testing I mean doing more than a simple -optc... -optc... -optl... addition to the command line, although an initial build using a current mingw version of GHC may certainly do this.

c) it can link a binary using the MS linker

2. modify Cabal so that it can use this GHC, and MS tools

3. modify the build system where necessary to know about .obj, .lib, etc.

A bit invasive (it involves modifying the make rules so they take an object-suffix variable).
Instead of the current suffix.mk:

    $(odir_)%.$(way_)o : %.hc

it should be:

    $(odir_)%.$(way_)$(obj_sfx) : %.hc

or some such. This may affect other builds, especially if for some reason autoconf can't determine the object-suffix for a platform, which is one reason I suggested a platform-specific settings file. I could handle this by having autoconf set the target variable, put all the Windows-specific settings in a settings.mk file (including a suffix.mk copy), and have make include that file.

4. modify the core packages to use Win32 calls only (no mingw)

That is where a lot of preparation is going. This is *much* harder to do from mingw than from VS tools, since you have to set up all the paths manually.

5. Use the stage 1 GHC to compile the RTS and libraries
6. Build a stage 2 compiler: it will be a native binary
7. Build a binary distribution

I told Thorkil I would have a version of the replacement library available for him as soon as possible. I'll shut up now. It looks like a long weekend.

Cheers, Pete
Re: 64-bit windows version?
On Jun 22, 2007, at 11:42 AM, Simon Marlow wrote: Peter Tanski wrote: A bit invasive (it involves modifying the make rules so they take an object-suffix variable). Instead of the current suffix.mk rule: $(odir_)%.$(way_)o : %.hc it should be: $(odir_)%.$(way_)$(obj_sfx) : %.hc or some such. This may affect other builds, especially if for some reason autoconf can't determine the object-suffix for a platform, which is one reason I suggested a platform-specific settings file. I could handle this by having autoconf set the target variable, put all the windows-specific settings in a settings.mk file (including a suffix.mk copy) and have make include that file.

Surely this isn't hard?

ifeq $(TargetOS) windows
osuf=obj
else
osuf=o
endif

and then use $(osuf) wherever necessary.

Yes, it is easy, but now all Makefiles must be changed to use $(osuf), such as line 378 in rts/Makefile: %.$(way_)o : %.cmm $(H_FILES), for what will be a (hopefully) temporary Windows build.

4. modify the core packages to use Win32 calls only (no mingw) That is where a lot of preparation is going. This is *much* harder to do from mingw than from VS tools since you have to set up all the paths manually.

I don't understand the last sentence - what paths? Perhaps I wasn't clear here: I'm talking about the foreign calls made by the base package and the other core packages; we can't call any functions provided by the mingw C runtime, we can only call Win32 functions. Similarly for the RTS. I have no idea how much needs to change here, but I hope not much.

To use the MS tools with the standard C libraries and include directories, I must either gather the environment variables separately and pass them to cl/link on the command line or I must manually add them to my system environment (i.e., modify msys.bat, or the windows environment) so msys will use them in its environment.
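The object-suffix switch being discussed can be played out in plain shell; this is only a sketch of the decision, with TargetOS standing in for whatever autoconf's target check would report (the real change would of course live in make, per the settings.mk proposal above):

```shell
# Stand-in for autoconf's target check; "windows" is assumed here.
TargetOS=windows
case "$TargetOS" in
  windows) osuf=obj ;;   # MS toolchain object files
  *)       osuf=o   ;;   # everything else
esac
# Every pattern rule would then use the variable instead of a literal "o":
echo "pattern rule: %.\$(way_)$osuf : %.hc" | tee /tmp/osuf-demo.txt
```

With TargetOS set to anything other than windows, the same sketch falls through to the usual `.o` suffix, which is why only the Windows build would need the extra settings file.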
The other problem is the old no-pathnames-with-spaces problem in Make, since Make must be made to quote all those environment variables when passing them to cl. I could use the Make trick of filling the spaces with a character and removing that just before quoting, but that is a real hack and not very reliable--it breaks $(word ...). Altogether it is a pain to get going and barely reproducible. That is why I suggested simply producing .hc files and building from .hc using VS.

Cheers, Pete
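The spaces problem is easy to reproduce in any Bourne shell; the directory name below is made up, standing in for the "C:/Program Files"-style paths that the MS tools environment variables contain:

```shell
# A made-up directory with spaces, like the paths in the VS environment.
dir="/tmp/Program Files Demo"
mkdir -p "$dir"

set -- $dir        # unquoted: the shell word-splits the path
unq=$#
set -- "$dir"      # quoted: the path survives as one argument
q=$#

echo "unquoted=$unq quoted=$q" | tee /tmp/quoting-demo.txt
```

The unquoted expansion turns one path into three arguments, which is exactly what happens inside a make recipe that forgets to quote an MS-style variable before handing it to cl.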
Re: 64-bit windows version?
On Jun 21, 2007, at 4:16 AM, Simon Marlow wrote: Peter Tanski wrote: skaller wrote: Why do you need mingw? What's wrong with MSVC++? The largest problem is the build system: GHC uses autoconf with custom makefiles.

So autoconf won't work with MSVC++, that is indeed a problem. But this doesn't mean we have to stop using Makefiles and GNU make - the rest of the build system will work fine, provided it's told about the different conventions for names of object files etc. I don't see a compelling enough reason to stop using GNU make. The build system doesn't even need to invoke CL directly, since we can use GHC as a driver (isn't this the way we agreed to do it before?). We use autoconf in a pretty limited way (no automake), so I don't think it will be hard to work around, even if we have to just hard-code all the configuration results for Windows.

The make system does work well and must be kept in order to port GHC to a new posix platform--too many parallel projects (pun intended) work with the current system. I have not kept a good count of monthly configuration-based bugs but there are at least a few a month, for known platforms, including OS X (a significant user base) and Mingw. If I could change one feature of the current system I would set up a wiki page with specific build requirements (I mean location, program/library with function declaration), and for known systems use autoconf only to determine what the $(build) system is and to ensure those programs are available, then jump into make which would call pre-set configuration makefiles for that system.

I spent a good amount of time writing the replacement library build system in GNU Make (min. 3.8--the current min is 3.79.1) to blend seamlessly with the current system. It does use a custom configure script written in Python (more consistently portable, no temporary files of any kind in $(srcdir))--John, that is where I originally used Interscript: to bake configuration settings into the setup files.
The configuration determines what system it is on and the relative-path location of certain requirements if they are not already available--for testing the processor type and os support (when it can't read from something cool like /proc/cpuinfo) it does build small programs, but all building is done in the build directory, which may be located anywhere you want. It then sets those parameters for configuration files that already contain other presets for that platform; general guesses may go into the main GHC autoconf and I will keep them very simple (new architectures get the generic C library by default). I simply can't convince myself that it is better to use a guess-based system for architectures I already know, especially when it also makes cross-compiling more complex than necessary. For Windows it uses a VS project and calls that from a DOS-batch file (for setup parameters) so you can run it from the command line.

What I hope you would agree on for Windows-GHC is a build that ran parallel to the autoconf-make system. Of course that would require some maintenance when things change in the main system but I could write update scripts for trivial changes; I believe anything more complex should be carefully checked in any case. VS is troublesome (its project files are written in XML, but that may be automated). If you would rather use a Make-like system I could do it in Jam and then you would add only a few extra Jamfiles to the current system. As a bonus, either VS or Jam would reduce build times, especially re-build times, and would probably reduce the number of configuration bugs we see around here. I would not suggest CMake, SCons or WAF; John wisely advised against anything invasive.

Cheers, Pete
Re: 64-bit windows version? (Haskell is a scripting language too!)
Brian Hulley wrote: To port GHC to a completely new platform, you'd of course need a Haskell compiler or interpreter already. However to bootstrap the process only a slow interpreter would be needed so as long as a portable pre-built bytecode version was available for download the only thing left to port would be a byte code interpreter which could just be a single file of c code.

This was one void Yhc was designed to fill, especially by compiling to java bytecode. At the rate I work--if I'm the only one deconstructing the current build system--by the time I'm done the Yhc team will have everything running.

Greg Fitzgerald wrote: I was trying to build something like this recently but hit a roadblock. ... Unfortunately, the imported module needs to have the line module X.Y.Z where, which means the file needs to be aware of its parent directories. I think that's too harsh a constraint, and makes it a pain to move things around (true in everyday Haskell projects with local modules too!).

Have you looked at Neptune? It is a scriptable build system based on Jaskell, which allows dynamic imports. An example from the page at http://jaskell.codehaus.org/Jaskell+for+Neptune#JaskellforNeptune-90:

//in script1.jsl
1+x where x = import {file=script2.jsl};
end

You may imagine that the string for script2.jsl may be computed. Of course this sort of thing breaks the type system in Haskell and the result is more Make-like, but that is the tradeoff. Now why did I not try a build system using Neptune? Probably because I had already spent the last three weeks learning CMake, the peculiarities of SCons, WAF (weird bugs!), m4 (I never had to write tests in Autoconf before or debug my own configure files), and higher level Make (so it would do what I can do in Jam)--I guess it got lost by the wayside... I was looking at things most people would either already know or would want to learn and that should already be available on new platforms.
skaller wrote: The second is simply that dynamic typing is generally better for build systems, because it allows code to 'self-adapt'.

There is a somewhat slow-going scheme-based-build-system project called Conjure, at http://home.gna.org/conjure/ but it only supports Linux and OS X. An alternative is to implement the build system in, say, Scheme, and then write a Scheme interpreter in Haskell. Scheme can self-adapt internally because its compiler is built-in. That is why I was looking into using SISC--it is self-contained and may even be distributed along with the source code (SISC itself is GPLv2 but that doesn't matter for _using_ it)--by default it looks in the current directory. The downside is the lack of a library with directed graphs.

Cheers, Pete
Re: 64-bit windows version?
skaller wrote: On Tue, 2007-06-19 at 12:23 +0100, Simon Marlow wrote: Bulat Ziganshin wrote: Hello glasgow-haskell-users, are you plan to implement 64-bit windows GHC version? The main thing standing in the way of this is the lack of a 64-bit port of mingw. Why do you need mingw? What's wrong with MSVC++?

The largest problem is the build system: GHC uses autoconf with custom makefiles. I have looked into porting the whole thing to a Visual Studio project, using SCons (unreliable), CMake (limited command line abilities--good for a one-shot build but really just a safe lowest-common-denominator version of Make), Waf (another python-based build system that started as a fork of SCons for the KDevelop changeover from Autotools) and Jam. I would prefer to use Jam but I'm afraid I would be the only one who would ever want to support it. Nothing has the auto-configuration abilities you (John) built into the Felix Interscript-based system, but I do not think porting the build system (at least) to Interscript would go over well with anyone else who wanted to maintain it, and the build itself would require heavy customisation. I have tested all of these on a small scale (the replacement-Integer library). The best option seems to be to create a VS project (not trivial--lots of glue) so a user may also call that from Make (if under Mingw) or pure DOS.

There is also some gcc-specific code in the RTS (inline assembler, use of extern inline, etc.)

By the way, as of gcc-4.2 (I believe; I know it is true for gcc-4.3) the use of 'extern inline' now conforms strictly to the C99 standard, so we will have to add the option '-fgnu89-inline' to get the old behaviour back--'extern inline' is used in some of the headers. Converting those 'extern inline's to 'static inline' or, better yet, plain 'inline' would also solve the problem. Ian Taylor's message at http://gcc.gnu.org/ml/gcc/2006-11/msg6.html describes this in greater detail; his proposal was implemented.
I don't think we'll be able to drop the mingw route either, mainly because while the MS tools are free to download, they're not properly free, and we want to retain the ability to have a completely free distribution with no dependencies.

I don't know of any completely free 64-bit compilers for Windows. The Intel compilers are free for 30-day evaluation but everything else is for Win32. For the base Win32-native port there are many compilers available but I have mostly worked on using CL and Yasm (assembler) as replacement back-end compilers for GHC.

There are people that want a Cygwin port too; personally I think this is heading in the wrong direction, we want to be more native on Windows, using the native object format and interoperating directly with the native Windows tools.

Cygwin has a real problem with gcc: it is far behind everything else (gcc-3.4.4, though Mingw isn't much better) and it doesn't look like that will change anytime soon. It is also only 32-bit, I believe.

Cheers, Pete
Re: 64-bit windows version?
Simon Marlow wrote: GHC *developers* wouldn't be any better off either. You'd still need either Cygwin or MSYS for the build environment. There's no way I'm using MS build tools, ugh.

The way I have it set up (so far) is as simple as running configure and make--all from the command line, under DOS or Mingw, although someone with VS tools may open up the VS project in the IDE. Would that be o.k.? I am not particularly enamored with VS myself, but that may be a consequence of having a small monitor for my Windows machine and constantly comparing it to the Xcode/Emacs combination I normally use. The VS debugger *is* very good and helped me pick out some bugs in Yasm quickly--when I only really know how to use gdb.

Cheers, Pete
Re: Locating shared libraries
On Jun 19, 2007, at 4:05 AM, Simon Marlow wrote: Peter Tanski wrote: Now each GHC-Haskell-based program installer would search /usr/local/lib for, say, libHSrts_dyn-6.6.1.dylib and install that version if necessary. What happens on uninstall?... That is why I think your idea was good: put everything into distinct directories.

We are not intending to build-in any particular knowledge about where shared libraries are to be installed - that's up to the packager.

Definitely. It would be non-portable if GHC baked the install directory into the shared library install_name (using libtool lingo) whenever a programmer (or Cabal) invoked ghc --make.

With one exception - we have to pick a default for the .tar.bz2 binary distributions (and 'make install'), and the only default that makes sense is to install the shared libraries in the standard location, /usr/local/lib. Otherwise the system will not find them, and GHC itself will not run (assuming it is dynamically linked). You don't get good uninstall support, but that's always been the way if you don't use the system package manager.

I advocated putting everything in /usr/local/lib/ghc/ghc-$(version) earlier. The dynamic-library system used for ghc-6.4 on OS X worked fine; do you remember any problems when that was put together? Stefan O'Rear seemed against flooding /usr/local/lib with ghc libraries--I'll admit my own /usr/local/lib is a bit messy even considering I use 'port' for quite a few programs--but also argued that the dynamic libraries should not go in /usr/local/lib/ghc-$(version).
The de-facto standard for systems that have C- or C++-compliant dynamic libraries seems to be:

shared libraries go in: /usr/local/lib
static libraries or system-specific libraries go in: /usr/local/lib/$(system)/$(system_version) or $(build)/$(system_version)

So for nhc98, the static libraries are in /usr/local/lib/nhc98/$(build); for yhc, the .ycr files are in /usr/local/lib/yhc/packages/yhc-base/$(yhc_version); for felix the .flx files (as source code) are in /usr/local/lib/felix/$(felix_version)/lib; for ocaml, the .cmx and .cmi files go in /usr/local/lib/ocaml; but for chicken-scheme the dynamic libraries (only really usable through the chicken interface but definitely pure C in the end) are in /usr/local/lib. I should not neglect to say the same goes for python. The one exception seems to be for gcc's libstdc++, which has a symlink in the same directory as the static libraries.

Following what I--perhaps mistakenly--called the 'de facto' standard, if ghc dynamic libraries are callable from C (they are), then the dynamic libraries should go into /usr/local/lib. I strongly suspect that symlinking an un-versioned name to each version would create a mess, so the library names should only follow the real version. Stefan does have a point, so the default installer might place dynamic libraries in a user library directory such as $(home)/lib--a real consideration for students and others who work on a large shared system where the sysadmin does not want to support yet another language installation.

What seems backwards (to me) are the Haskell programs: it would be fine if the standard install for program static libraries and interfaces went into the ghc-$(version) directory, but they don't, and when we had dynamic libraries on OS X they followed the static library convention: each program is installed into /usr/local/lib/$(program) or $(program)-$(version).
Some programs place libraries directly into the program directory while others place the libraries under a $(haskell_compiler)-$(haskell_compiler_version) directory. This duplicates the ghc system of /usr/local/lib/ghc-$(version) for each Haskell program and creates a real mess--more so than other languages. I agree, this is not really GHC's problem, but the ghc location might help, which is why I suggested /usr/local/lib/ghc/ghc-$(version). It might even be extendable to all of Haskell: ghc should go into /usr/local/lib/haskell/ghc-$(version), so, say, yhc could go into /usr/local/lib/haskell/yhc and the installed programs would go into /usr/local/lib/haskell/$(compiler). Much cleaner and much easier for a package system to manage.

I have written too much on this, so I'll shut up--whatever you decide is fine; I'll fix the install script to create a PackageMaker .pkg following whatever you decide and post it if you want it.

Cheers, Pete
Re: Locating shared libraries
skaller wrote: On Fri, 2007-06-15 at 19:40 -0500, Spencer Janssen wrote: On Sat, 16 Jun 2007 08:21:50 +1000 skaller [EMAIL PROTECTED] wrote: One way to measure this is: if you removed GHC and applications, and there are (necessarily) no users of the remaining library package .. the library package shouldn't be in the global public place (/usr/lib+include etc).

As I understand it, the entire point of this effort (shared libraries in GHC) is to allow dynamically linked Haskell executables. In this case, applications outside the GHC toolchain will in fact depend on these shared objects. As a concrete case, a binary darcs package could be a user of libghc66-base.so and libghc66-mtl.so -- with no dependencies on the GHC compiler package itself. Does this pass your litmus test?

Yes, it passes the separability test. My darcs wouldn't run otherwise! And versioning the library filename as above is a good idea too. Felix adds _dynamic for shared libs and _static for static link archives to ensure the Linux linker doesn't get confused. However, the libs still aren't fully public if the interfaces are only private details of the GHC tool chain. Hmmm.

Under the current system, darcs is linked statically so it is a self-contained executable. Under the proposed shared-library system versioning the shared libraries may pose a very big problem and I don't think it would pass your litmus test. As I mentioned previously, GHC is a fast-moving target and there are quite a few GHC-Haskell-based applications available that rely on different versions of the compiler. Here is an example: There are any number of GHC-Haskell-based programs, all built with different versions of GHC; GHC itself relies on some of these to build, such as Happy and Alex. (There are .rpm packages for Alex.) You already talked about the situation from there: several different programs relying on different versions of the shared libraries, located in /usr/local/lib--and they don't rely on just one library.
As is, the GHC runtime system has more than a few base libraries, the minimum set of which is: HSrts, HSbase and HSbase_cbits. With dynamically linked libraries on OS X, the convention was to add the suffix _dyn, so we had: HSrts_dyn, HSbase_dyn and HSbase_cbits_dyn.

Now each GHC-Haskell-based program installer would search /usr/local/lib for, say, libHSrts_dyn-6.6.1.dylib and install that version if necessary. What happens on uninstall? The same thing you get on Windows when you have another program using a particular .DLL--the uninstall of that version of the library would fail, but for unix systems _only_ if you also have another program using it while you are doing the uninstall. So if you did not break everything on each install, eventually you have a complete mess of different versions of GHC libraries in /usr/local/lib that may have no current use but at one time were used for several GHC-Haskell-based programs that have now been upgraded to use something different. Hopefully those who distributed the binary programs adopted a convention of using the full version of the library instead of symlinking libHSrts_dyn-6.6.1 to libHSrts_dyn, or as a user several of your older programs might break after a later one installed a new version of the library and symlinked the new version...

That is why I think your idea was good: put everything into distinct directories.

Cheers, Pete
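The un-versioned-symlink failure mode described above can be played out with plain files. The library names follow the message; the scratch directory stands in for /usr/local/lib, and the 6.8.1 version number is a made-up "later release":

```shell
libdir=/tmp/ghc-shlib-demo
mkdir -p "$libdir"
cd "$libdir"

# Program A's installer ships 6.6.1 and symlinks the bare name:
touch libHSrts_dyn-6.6.1.dylib
ln -sf libHSrts_dyn-6.6.1.dylib libHSrts_dyn.dylib

# Program B's installer later ships 6.8.1 and repoints the same symlink:
touch libHSrts_dyn-6.8.1.dylib
ln -sf libHSrts_dyn-6.8.1.dylib libHSrts_dyn.dylib

# Any program linked against the bare name now silently gets 6.8.1:
readlink libHSrts_dyn.dylib
```

A program that recorded the fully versioned name at link time is unaffected by the repointed symlink, which is the convention the message hopes binary distributors adopted.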
Re: Locating shared libraries
On Jun 18, 2007, at 6:06 PM, Stefan O'Rear wrote: On Mon, Jun 18, 2007 at 11:56:57AM -0400, Peter Tanski wrote: Now each GHC-Haskell-based program installer would search /usr/local/lib for, say, libHSrts_dyn-6.6.1.dylib and install that version if necessary. What happens on uninstall? ... That is why I think your idea was good: put everything into distinct directories.

Debian's high level package manager will automatically garbage collect dependencies, such that every package on your system is either manually selected for install or a dependant of one. Thus there is no library leak problem.

The Debian package system (I assume you are referring to apt, or maybe dpkg?) is very convenient (and relatively quick!). I don't know how well the BSD-style package systems garbage collect but I have not had any problems using port or fink--though sometimes they pull more dependencies than they need because installed programs outside their system are (wisely) not considered. Since my best machine is a Mac (PowerBook G4, 1.25GHz) the lost libraries become a problem as soon as I work independently of a package system. I expect the same would be true for anyone working independently. Some individual languages have their own package systems, such as Lisp's asdf; OCaml has GODI. Haskell is moving in that direction with Cabal but it is still more of a build system than anything else and in any case it would probably be better to keep with the system packaging setup, such as Debian's, for program distribution. That way GHC libraries may reside comfortably in the standard system directories, but I am still a little querulous. I would rather all haskell libraries use the standard (OS X) DYLD_LIBRARY_PATH, though DYLD_FALLBACK_LIBRARY_PATH (putting things in $(home)/lib) might help for the haskell-universe. That may be the solution for programs, though it requires more work: Haskell would need packaging.
On a related note, if anyone is interested, once we get an idea of where we want to put things I can modify the installer script to create an OS X pkg (using PackageMaker) for GHC binary distributions. At least that would make removal easier (using DesInstaller). The old 6.4.1 OS X application install was slick; I might redo that.

Cheers, Pete
Re: Locating shared libraries
Spencer Janssen wrote: On Sat, 16 Jun 2007 08:21:50 +1000 skaller [EMAIL PROTECTED] wrote: One way to measure this is: if you removed GHC and applications, and there are (necessarily) no users of the remaining library package .. the library package shouldn't be in the global public place (/usr/lib+include etc).

As I understand it, the entire point of this effort (shared libraries in GHC) is to allow dynamically linked Haskell executables. In this case, applications outside the GHC toolchain will in fact depend on these shared objects. As a concrete case, a binary darcs package could be a user of libghc66-base.so and libghc66-mtl.so -- with no dependencies on the GHC compiler package itself. Does this pass your litmus test?

If a binary darcs package requires GHC libraries to function it would be a GHC application, so it would not pass the test. John Skaller's point is actually very good, though it would require special changes to the current system: distinguishing between GHC-compiled _shared_libraries_ and the GHC compiler itself. Under the current system ghc sits in its own subdirectory, typically under /usr/local/lib, /opt/local/lib, etc., depending on your distribution, with only the compiler and base libraries. Relatively few Cabal packages install into this directory--HaXml does, if I remember correctly. Most Cabal packagers leave the default install, so applications compiled by GHC are installed as subdirectories in the user library directory (/usr/local/lib).
So we have this layout:

/usr/local/lib/ghc-6.6.1 -- compiler, ghc-distribution libraries
/usr/local/lib/ghc-6.6.1/imports -- .hi files

after installing HaXml:

/usr/local/lib/ghc-6.6.1 -- HaXml libraries
/usr/local/lib/ghc-6.6.1/imports -- HaXml .hi files

after installing a default Cabal distribution (here, HXT):

/usr/local/lib/hxt-$(version) -- libraries
/usr/local/lib/hxt-$(version)/imports -- hxt .hi files

With the current setup it is a real pain to go through and uninstall all those extra libraries when you upgrade GHC: you must either delete the directories completely (for directories based on version, such as hxt-$(version), above); or, you must go into each directory and delete the old libraries contained there. The ghc-pkg program only handles finding these libraries for the ghc compile process; it does not help with uninstalling them except by tracking location.

I would extend John's idea--and reference to gcc's install layout: everything should go into the generic ghc directory, say, /usr/local/lib/ghc; _under_ that should go the compiler version (including arch-platform); _under_ that may go user-libraries. That structure would make it much easier for users to uninstall or upgrade ghc versions. It also gives distributors the flexibility to include a ghc-library-only distribution (no compiler) for packaging applications--say, a basic darcs install (with shared libraries) includes only the shared libraries it needs. A basic ghc directory on top--no version until the next subdirectory--also preserves prime real estate toward the top levels of the directory structure: a user may install more than one version of GHC into their home directory without flooding their home directory with ghc-$(version)s.

There is a fundamental flaw with all these new languages: ABI changes between compiler versions force software vendors to give customers upgrades of the entire program--you can't send a patch if you need to compile all the libraries again.
Static libraries are horrible for the same reason, and one of the main reasons I will not use OCaml for any distributable application. Even in-house, a hassle like that means lost time and productivity. Patches are inevitable for new features, security, etc., and changes to the compiler are inevitable just to get around bugs in the compiler! GHC is a research effort so it is a relatively fast moving target--how much Haskell code has bit rotted simply due to changes in the GHC compiler? In the past _three_ years? How much simply due to changes in the _interface_ file (.hi) format? A lot. Too much to mention here. Even if we can't standardise the ABI a bit, the solution of where to put libraries should place some real consideration on how much time and trouble it is to uninstall and recompile everything, without resorting to a script or package system to do it for us--you may imagine the bug bath from, say, modifying ghc-pkg to handle uninstallation.

Cheers, Pete
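The per-version layout proposed in this thread can be sketched under a scratch prefix; the version numbers and the hxt package directory below are illustrative stand-ins, not what any real installer does today:

```shell
prefix=/tmp/ghc-layout-demo        # stand-in for /usr/local

# Everything under one generic ghc directory, version underneath,
# user libraries underneath that:
mkdir -p "$prefix/lib/ghc/ghc-6.6.1/imports"
mkdir -p "$prefix/lib/ghc/ghc-6.6.1/hxt-7.0/imports"
mkdir -p "$prefix/lib/ghc/ghc-6.8.1"   # a second GHC, side by side

# Removing or upgrading a GHC (and every library built against it)
# becomes a single directory removal:
rm -r "$prefix/lib/ghc/ghc-6.6.1"
ls "$prefix/lib/ghc"
```

Compare that single `rm -r` with the current situation described above, where stale libraries must be hunted down across versioned and unversioned directories in /usr/local/lib.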
Re: GHC on Cray XMT?
Chad Scherrer wrote: The new Cray XMT seems to target exactly the same kind of analytic problems where Haskell excels: http://www.cray.com/products/xmt/ I'm not a hardware guy, but this seems like a natural fit with the recent advances in SMP Haskell. I know some compiler experts, but none of them have experience with Haskell. Can anyone tell me how big an undertaking this would be, to get GHC going on one of these? It seems to me that the absence of cache on XMT processors should simplify things, since there are fewer issues reasoning about locality.

Porting GHC to a new architecture may be easy or very involved. See http://hackage.haskell.org/trac/ghc/wiki/Building/Porting. Porting to the Cray XMT would appear to be very difficult, but a real plum! I am not sure how close its UNICOS operating system is to BSD (GHC has ports to FreeBSD and OpenBSD)--that is just the main programming environment; working with the processors may require a smaller microkernel environment. According to the XMT Datasheet, the main programming is done on Linux-based nodes. Linux is GHC's home OS, so that would seem to make it a little easier.

For an experience porting Glasgow Parallel Haskell (GPH) to an MPP, you might want to look at the work of Kei Davis (http://www.c3.lanl.gov/~kei/), who ported an older version of GHC to the Thinking Machines CM-5 SPMD (http://www.c3.lanl.gov/~kei/lanlwork.html and especially, MPP Parallel Haskell (Preliminary Draft), 1996, at http://www.c3.lanl.gov/~kei/ifl96.ps). Kei Davis's work is also referenced on the Glasgow Parallel Haskell site, under PORTS (http://www.macs.hw.ac.uk/~dsg/gph/). I mention his work in particular because the CM-5 port involved a slightly new (Solaris-like) OS and a specialised message passing system, CMMD, instead of the PVM used by Glasgow Parallel Haskell.
That would be quite a lot of work since you would have to modify the RTS--the old GUM (Parallel Haskell support) stuff is still in there but it is somewhat bit-rotted. The amount of work required might get worse or better as GHC moves toward native assembly output--worse, because you would have to work through the assembly code, and efficient C-code generation would require more work with a nasty Perl script called the Mangler; better, depending on whether someone essentially rewrites the ugly spider-web of code in GHC's Native Code Generator (NCG). But all this stuff is for a native port: if you have access to an XMT, you might want to simply follow the directions on Porting GHC to a new platform (http://hackage.haskell.org/trac/ghc/wiki/Building/Porting#PortingGHCtoanewplatform) and see what problems you run into.

Cheers, Peter Tanski
Re: status of GreenCard support
Jefferson Heard wrote: Is there some simple fix that would make code generated by GreenCard work, or is there some alternative tool that will make it work? What's the status of C2HS?

Marc Weber wrote: When using cabal and GreenCard it might be the case that you have to tweak setup.hs to add --libdir parameters. Probably it would be better to update cabal. Maybe I've missed a point to configure it correctly? If I can save you some minutes by dropping you my quick setup.hs hack, let me know.

I should have answered this before, sorry. GHC has not supported GreenCard since version 6.2. See http://www.haskell.org/haskellwiki/Libraries_and_tools/Interfacing_other_languages and http://www.mail-archive.com/haskell@haskell.org/msg14004.html

I ran into that problem while attempting to get gcjni-1.2 to work with GHC 6.4 awhile ago, got annoyed, and appended the warning to the Haskell Wiki page lest others run into the same problem. Alastair Reid's successor to GreenCard, based on Template Haskell, hasn't been finished yet (or perhaps it is dead?). For more automatic configuration, you might consider Template Haskell: see Ian Lynagh's Template Haskell: A Report From the Field, at http://web.comlab.ox.ac.uk/oucl/work/ian.lynagh/papers/ . There are still configuration parameters in the GHC build system for GreenCard but from what I have seen of the rest it seems that you would have to write your own support for the old _casm_ ffi bindings back into the source code practically from scratch (perhaps grabbing an archive of 6.1 would give you a better start).

Cheers, Peter Tanski
Re: HEAD ghci crash on ppc OS X
On 16 Jan 2007, David Kirkman wrote: On 1/15/07, David Kirkman [EMAIL PROTECTED] wrote: The comment above seems to be begging for an oc->image += oc->misalignment; statement right after the call to stgMallocBytes. This did in fact work (patch for rts/Linker.c below). And it *was* in the 6.6 sources, so it was removed in the last few months. Putting the line back in seems to fix everything on ppc darwin: my code which uses ghc as a library to do a bunch of dynamic compilation and loading works again, and the test suite looks pretty much like the results from the latest linux builds (http://www.haskell.org/ghc/dist/current/logs/x86_64-unknown-linux-test-summary) Awesome! (I bet you haven't heard that exclamation since the '80s.) I can only get a partial darcs repository, so I can't figure out why the line was removed -- presumably to fix something that my patch will proceed to rebreak :) I doubt you would re-break anything: the change was created by a patch fixing Bug #885. If you look at the .dpatch in the listing for #885, the patch records at lines 329-335:

hunk ./rts/Linker.c 1379
- oc->image += misalignment;
-
+# else
+ oc->image = stgMallocBytes(oc->fileSize, "loadObj(image)");
+# endif
+

but did not add your change back in. Cheers, Peter Tanski
Re: Problems building GHC
Hello, Rodrigo Geraldo wrote: I started to hack GHC and I've tried to build it, but the build returns this error message: configure: GMP_CHECK_ASM_W32: fatal: do not know how to define a 32-bit word make[1]: *** [boot] Error 1 I've tried to build GHC using MSYS and MinGW under MS Windows XP... This is an error from the internal GMP library's configure file, $(top_level)/rts/gmp/configure. If you don't have gmp installed already, the GHC build system tries to build it for you. The build system for gmp is very delicate so I would suggest downloading a separate gmp library and installing that in your /usr/lib directory (for MinGW). The gmp library and header file for MinGW are available from http://cs.nyu.edu/exact/core/gmp/ as 'gmp-static-mingw-4.1.tar.gz'. Once you have the library installed properly, the GHC build system will find it and won't attempt to build gmp. I know it sounds like a workaround--we really ought to ensure the gmp build itself works--but in this case it is the gmp build that fails (swox.org's problem, not ours) and the current gmp system should be replaced soon by another library. (I am working on that but for lack of time I am running a bit late.) Cheers, Peter Tanski
rfc: colorized Haskell on GHC-Trac
To anyone who has an interest in reading GHC-Trac: In an idle-procrastination moment I started to experiment with HTML colour for Haskell code (using hscolour) on a few Trac pages: http://hackage.haskell.org/trac/ghc/wiki/ReplacingGMPNotes (some blocks, note Cyan colour for literal integrals (1)) http://hackage.haskell.org/trac/ghc/wiki/ReplacingGMPNotes/TheCurrentGMPImplementation (one-line snippets) http://hackage.haskell.org/trac/ghc/wiki/Commentary/Compiler/CmmType (large blocks) What do you think? Better? Worse? Would you rather see a silver/grey background as for other code? Personally I tend to prefer the orange for top-level function definitions but am less sure about orange-coloured functions at their use sites. Most of this is close to emacs colour codes. For those who are interested in playing around, the easy way to do this is to demarcate a code block:

{{{
#!html
<pre>
<font color="Orange">topLevelFunction</font> ...
</pre>
}}}

Thanks, Peter Tanski
Current GMP implementation
Hello Thorkil, I updated the wiki Replacing GMP/The Current GMP Implementation page with some introductory notes. I hope the graphic cuts down on the amount of wording necessary. There is no discussion of the Cmm implementation as that is contained in Esa's posts to the GHC users list or in the PrimOps pages in the new Commentary. (Maybe I should move the GMP-Cmm-Haskell discussion into The Current GMP Implementation page but I really have to get some honest coding done.) Cheers, Peter
Re: Replacement for GMP: Update
On Dec 29, 2006, at 8:32 AM, Thorkil Naur wrote: On Friday 01 September 2006 06:43, Peter Tanski wrote: ... For a brief overview of the speed of the libraries I looked at carefully, see http://hackage.haskell.org/trac/ghc/wiki/ ReplacingGMPNotes (I added a few charts to show the speed of ARPREC, OpenSSL's BN, GMP and LibTomMath. ) I tested GMP and OpenSSL's BN for reference. I must confess that it took me a while to understand what you were doing here and I am still uncertain: For example, how many multiplications were actually performed for bit size 4096? 4096 = size 4 (counting from zero: sizes[5] = {256,512,1024,2048,4096}) 1,000,000 / ( 4 * 3 ) = 83,333 rounds My reason for doing this was simple: as a basic comparison, rather than an absolute measurement, the number of rounds doesn't matter as long as the results are measurable. Besides, more rounds means more time running each test. I did a lot of tweaking, especially with ARPREC, to get each library to (1) a generally available speed and (2) a configuration similar to what it would be when used with GHC. I could have measured the speed in nanoseconds, with one iteration for each calculation using random numbers of a specified size and posting the mean for a number of trials but that would have required me to use OS-X specific profiling software like Shark in order to get reliable measurements--a bit more labour intensive as it would require me to manually perform each test. (I did use random numbers of a specified size.) In addition, for Powers, the markings on the horizontal axis (256 pow(n,3), 512 pow(n,4), 1024 pow (n5) (missing a , here?), ...) on your graph seem to indicate that you are changing two different variables (the bit size and the exponent) at the same time. Yes, the testing is a bit sloppy there (so is the graph; ugly typo). The graph shows a general trend more than anything else. 
I actually tested the Exponentials (Powers) individually for each size and timing but posting a 3-D graph or making the graph (time = exponent/size) seemed like overkill or would conflict with the previous measurements. Not a bad idea, though, just for clarity. I would suggest that you also quoted the original measurements directly. And perhaps (especially in case of the Powers and Division) some more details about what was actually measured. I did quote the original measurements directly. There wasn't much variance overall and I took what seemed like median results from a number of tests. What matters is the relative time to initialise and perform each computation since in a GHC-implementation each computation would require some minimal initialisation. ARPREC was built for this and in ARPREC-only tests the major factor in speed of initialisation was the time to compute the architecture and precision-specific constants for PI, Log_2 and Log_10 (the Epsilon constant doesn't require much time). Log_2 and Log_10 are necessary for printing Integers because computations in ARPREC are performed as floating-point values and must be converted to decimal digits by multiplying by Log_10. (Note that printing Integers also requires a size increase as the mantissa holds the Integer value, requiring further multiplication by the float-exponent.) Details on differences between algorithms used in each library would be fairly complex: as you already know, each library (ARPREC, GMP, LibTomMath, etc.) uses a different algorithm based on the size of the number-array *and* each may have a different implementation of an algorithm--LibTomMath uses a simpler Karatsuba algorithm, for example. It is distinctly odd that squaring seems to be more expensive than multiplication for some bit sizes in 3 of the 4 measured cases.
This is also an implementation matter: the particular algorithms change with size and squaring may require some extra initialisation for, say, computing the size of the result and the number of operations. All of the libraries provide special squaring functions and I used those. LibTomMath is a good example: it uses its baseline squaring algorithm for small sizes and Comba-optimised Toom-Cook or Karatsuba algorithms for larger sizes. (I purposely did not tune LibTomMath or any of the libraries because I wanted a more- or-less average-case comparison, so the Karatsuba algorithm was used for size=512 bytes (128 digits) and Toom-Cook was used for size=2048 bytes (512 digits).) So where you see LibTomMath's time dip in the middle of the 'Squaring' graph it is using the Karatsuba algorithm. ARPREC uses a FFT for everything above size=256 and calculates with fewer actual digits (it stores extra size as an exponent, just like ordinary floating point numbers). The trends in the graphs for ARPREC versus GMP generally hold true until GMP's FFT kicks in, at size 32768. Also, I wonder what divisions
Re: bignums, gmp, bytestring, .. ?
On Nov 22, 2006, at 8:39 PM, Donald Bruce Stewart wrote: p.tanski: to my knowledge, that would be the first Haskell implementation for PalmOS... Pretty sure Tony Sloane et al at Macquarie have had nhc98 running on the Palm for quite a while. They've recently moved to YHC though, iirc. Donald, Thanks for the tip! Have you seen any of this code yourself? Jeremy, Dr. Sloane would be a good person to contact. For reference, the last work on this seems to have been done in 2005. Tony Sloane (asloane -at- ics dot mq dot edu dot au). Patroklos Argyroundis did a port of GMP version 3.1.1 to WinCE, noted at http://ntrg.cs.tcd.ie/~argp/software/ntrg-gmp-wince.html . I don't know if that might give any good pointers. Cheers, Pete
Re: bignums, gmp, bytestring, .. ?
On Nov 19, 2006, at 3:20 PM, Jeremy Shaw wrote: At Sun, 19 Nov 2006 13:46:10 -0500, Peter Tanski wrote: What is the problem building GMP for PalmOS? According to the GMP install documentation, it supports ARM and Motorola's m68k processors, so you would not be using generic C code. You are probably also using PRC-Tools, correct? Yes. I can not get past the configure step. I tried to build gmp 4.2.1 with prc-tools 2.3. I ran configure with these options: ./configure --build=i386-linux-gnu --host=m68k-palmos But all the test programs (conftest.c) fail to link because they use 'main ()', but PalmOS expects 'PilotMain ()'. I hacked the configure script and changed all occurrences of 'main ()' to 'PilotMain ()', but then it failed because the test programs could not find MemoryMgr.h. So I invoked configure with: CFLAGS=-I=/root/n-heptane/projects/haskell/palmos/sdk-5r3/include/ ./configure --build=i386-linux-gnu --host=m68k-palmos But now it fails to find a working compiler, with this error (from config.log): configure:7756: checking build system compiler m68k-palmos-gcc -I=/root/n-heptane/projects/haskell/palmos/sdk-5r3/include/ configure:7769: m68k-palmos-gcc -I=/root/n-heptane/projects/haskell/palmos/sdk-5r3/include/ conftest.c /usr/lib/gcc-lib/m68k-palmos/2.95.3-kgpd/libgcc.a(_exit.o)(.text+0x10): In function `exit': libgcc2.c: undefined reference to `_cleanup' /usr/lib/gcc-lib/m68k-palmos/2.95.3-kgpd/libgcc.a(_exit.o)(.text+0x16):libgcc2.c: undefined reference to `_exit' collect2: ld returned 1 exit status Did you set LDFLAGS to point to the prc-tools directory and set the first available 'ld' executable to the prc-tools 'ld'? I would help more but I am using darwinports to build prc-tools (I have no experience with prc-tools or PalmOS) and it fails to build, partly because Apple gcc 4.0.1 defaults to a dynamic C++ library, so prebinding fails. There is an easy workaround, so once I get enough CPU time to rebuild it I will try poking around some more.
And, around this time, my interest in running yhi on PalmOS starts to wane. Awww... to my knowledge, that would be the first Haskell implementation for PalmOS :) As I mentioned in a prior email, there is a Haskell arbitrary precision number library (BigFloat, at http://bignum.sourceforge.net/). You might adapt it for Integer and add it to Yhc if nothing else works. I'm not crazy about BigFloat's performance, though. Cheers, Pete
Re: bignums, gmp, bytestring, .. ?
Hi Jeremy, On Nov 17, 2006, at 10:34 PM, Jeremy Shaw wrote: At Sat, 18 Nov 2006 00:44:32 +, Neil Mitchell wrote One advantage you probably haven't thought of is the size of the binary. ... On a related note -- dropping the gmp requirement would also make it easier to port yhc to non-unix platforms. I have tried on a few occasions to compile the yhc runtime for PalmOS, but I can never get gmp built for PalmOS. What is the problem building GMP for PalmOS? According to the GMP install documentation, it supports ARM and Motorola's m68k processors, so you would not be using generic C code. You are probably also using PRC-Tools, correct? Cheers, Pete ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: bignums, gmp, bytestring, .. ?
On Nov 17, 2006, at 7:24 PM, Claus Reinke wrote: it seems that haskell versions of bignums is pretty much gone from more recent discussions of gmp replacements. now, I assume that there are lots of optimizations that keep gmp popular that one wouldn't want to have to reproduce, so that a haskell variant might not be competitive even if one had an efficient representation, but - do all those who want to distribute binaries, but not dynamically linked, need bignums? You are right: most don't. Even when working with larger numbers, I have very rarely used bignum libraries myself, mostly because there is usually a clever--and often faster--way to deal with large numbers, especially when you don't require all that extra precision. These methods were better known and relatively widely used before multi-precision libraries became so widespread and have become even more useful since 64-bit machines and C99's 64-bit ints came around. Integers are mostly a convenience. Large numbers are necessary if you need very high precision mathematical calculations or if you are doing cryptography; for that matter, high precision mathematics usually benefits more from arbitrary precision decimal (fixed or floating point) for certain calculations. The simple problem with Haskell and Integer is that, according to the Standard, Integer is a primitive: it is consequently implemented as part of the runtime system (RTS), not the Prelude or any library (though the interface to Integer is in the base library). For GHC, compiling with -fno-implicit-prelude and explicitly importing only those functions and types you need won't get rid of Integer. A possible solution would be to implement the Integer 'primitive' as a separate library and import it into the Prelude or base libraries, then perform an optimisation step where base functions are only linked in when needed.
Except for the optimisation step, this actually makes the job easier since Integer functions would be called using the FFI and held in ForeignPtrs. (I have already done the FFI thing for other libraries and a primitive version of the replacement.) - it would be nice to know just how far off a good haskell version would be performance-wise.. There is actually a relatively recent (2005, revised) Haskell version of an old Miranda library for infinite precision floating point numbers by Martin Guy, called BigFloat, at http://bignum.sourceforge.net/. Of course, it is floating point and Integers would be faster but the general speed difference between the two would probably be proportional to the speed difference in C and so would be just as disappointing. The BigFloat library (using the Haskell version) came in last place at the Many Digits Friendly Competition for 2005 (see http://www.cs.ru.nl/~milad/manydigits/final_timings_rankings.html), though you would probably be more interested in looking at the actual timing results to get a better idea. (The fastest competitors were MPFR, which uses GMP, and The Wolfram Team, makers of Mathematica; BigFloat actually beat iRRAM and Maple solutions for several problems.) The real problem with an Integer library written in *pure* Haskell--especially with Integers--is simple: Haskell is too high-level and no current Haskell compiler, even JHC, has even remotely decent support for low-level optimisations such as being able to unroll a loop over two arrays of uint32_t and immediately carry the result from adding the first elements from each array to the addition of the next two, in two machine instructions. I shouldn't have to mention parallelization of operations. In short, if you look at general assembler produced from any Haskell compiler, it is *very* ugly and Arrays are even uglier. (For a simple comparison to Integer problems, try implementing a fast bucket sort in Haskell.)
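The "separate library called through the FFI and held in ForeignPtrs" idea can be sketched against plain libc. Nothing below is from any actual replacement library: `BigNum` and `newLimbs` are hypothetical names, and malloc/free merely stand in for a real bignum allocator and destructor.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
module Main where

import Data.Word (Word32)
import Foreign.C.Types (CSize)
import Foreign.ForeignPtr (ForeignPtr, newForeignPtr, withForeignPtr)
import Foreign.Ptr (FunPtr, Ptr)
import Foreign.Storable (peekElemOff, pokeElemOff)

-- Hypothetical wrapper: a bignum is just a malloc'd buffer of
-- 32-bit limbs owned by a ForeignPtr whose finalizer is C's free.
newtype BigNum = BigNum (ForeignPtr Word32)

foreign import ccall unsafe "stdlib.h malloc"
  c_malloc :: CSize -> IO (Ptr Word32)

foreign import ccall unsafe "stdlib.h &free"
  c_free :: FunPtr (Ptr Word32 -> IO ())

-- Allocate n limbs; the buffer is freed when the GC drops the last
-- reference (and, as noted above, finalizers are only guaranteed to
-- have run by RTS shutdown).
newLimbs :: Int -> IO BigNum
newLimbs n = do
  p <- c_malloc (fromIntegral (n * 4))
  BigNum <$> newForeignPtr c_free p

main :: IO ()
main = do
  BigNum fp <- newLimbs 2
  withForeignPtr fp $ \p -> do
    pokeElemOff p 0 0xFFFFFFFF   -- low limb
    pokeElemOff p 1 1            -- high limb
    peekElemOff p 0 >>= print    -- prints 4294967295
```

The cost structure discussed in this thread lives in `c_malloc` and the `free` finalizer: replacing them with RTS-managed pinned memory is exactly the GC-based allocation that measured faster than malloc.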
GMP uses hand-written assembler routines for many supported architectures, partly because GMP was originally created for earlier versions of GCC which could not optimise as well as current versions. Even GMP cannot compare to an optimised library using SIMD (Altivec, SSE)--in my tests, SIMD-optimised algorithms are between 2x and 10x faster. SIMD and small assembler routines (especially for architectures without SIMD) are what I have been doing the bulk of my work on. I doubt I have the ability to extend the current state of the art with regard to higher-level polynomial optimisations, so I am always trying out any algorithm I can find. (For very high precision multiplication (more than 30,000 bits), not much beats a SIMD-enabled Fast Fourier Transform; a specially coded Toom-3 algorithm would be faster but for very large operands the algorithm becomes prohibitively complex. Division is another labour-intensive area.) - what would be a killer for numerical programming, might still
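For a concrete (and deliberately naive) picture of the limb arithmetic under discussion, here is the carry-propagating addition over arrays of uint32_t expressed in pure Haskell. This is a sketch only--a real library would run it over raw memory with SIMD or assembler--and `addLimbs` is an illustrative name, not anyone's API:

```haskell
module Main where

import Data.Word (Word32, Word64)

-- Add two least-significant-first lists of 32-bit limbs, rippling
-- the carry through in a 64-bit accumulator.
addLimbs :: [Word32] -> [Word32] -> [Word32]
addLimbs = go 0
  where
    base = 0x100000000 :: Word64
    go c []     []     = [fromIntegral c | c /= 0]   -- final carry-out
    go c (a:as) (b:bs) =
      let s = fromIntegral a + fromIntegral b + c :: Word64
      in fromIntegral s : go (s `div` base) as bs
    go c as     []     = go c as (map (const 0) as)  -- pad shorter side
    go c []     bs     = go c (map (const 0) bs) bs

main :: IO ()
main = print (addLimbs [0xFFFFFFFF, 0xFFFFFFFF] [1, 0])
-- the carry ripples through both limbs
```

The point of the assembler and SIMD work described above is precisely that this add-with-carry chain compiles to one `adc`-style instruction per limb in hand-written code, while compiled high-level code rarely gets close.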
Update to Replacement GMP page
Hello all, I made another update to the notes on Replacing GMP, at http://hackage.haskell.org/trac/ghc/wiki/ReplacingGMPNotes . It's pretty basic and you'd probably find it shabby, but comments and modifications are appreciated. I am still in the throes of trying to *portably* beat GMP for speed and accuracy. (The portability problem comes in because I may no longer rely on Altivec/SSE/3DNow! support on other processors and need to optimise in pure C.) -Pete
Re: Change Data.Bits.rotate to rotate Integer (unbounded) types
I don't have a particular implementation in mind but as a general idea it would make the treatment of Integers the same as the treatment of the standard-size bounded ints. A possible implementation might be a stream cipher that uses 128-bit Integers instead of 32-bit ints (bitwise rotations have been used in more than a few stream ciphers). For arithmetic purposes, rotation is also useful for implementing multiplication in finite fields. -Pete On Sep 19, 2006, at 3:03 AM, Lennart Augustsson wrote: And what would rotating an Integer mean? The only sensible interpretation I can think of is to make it behave like shift. On Sep 18, 2006, at 23:46 , Peter Tanski wrote: Welcome back! Since Data.Bits is not defined in the Haskell 1998 standard, are we free to change the implementation of Data.Bits? If we are free to change the implementation of Data.Bits, would it be all right to change the operation of rotate, rotateL and rotateR over unbounded types (to my knowledge, currently only Integer)? I would like to change rotate, rotateL and rotateR to actually rotate (not shift) Integers. -Pete
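For concreteness, here is one way "actually rotating" an Integer could be defined: since an unbounded Integer has no top bit to wrap around, the rotation width has to be supplied explicitly. `rotateAt` is a hypothetical helper written for this sketch, not a proposed Data.Bits function:

```haskell
module Main where

import Data.Bits (shiftL, shiftR, (.&.), (.|.))

-- Rotate the low w bits of n left by k, treating n as a w-bit word.
-- Bits above position w are masked off first; k is reduced mod w.
rotateAt :: Int -> Integer -> Int -> Integer
rotateAt w n k =
  let k'   = k `mod` w
      mask = (1 `shiftL` w) - 1
      n'   = n .&. mask
  in ((n' `shiftL` k') .|. (n' `shiftR` (w - k'))) .&. mask

main :: IO ()
main =
  -- 0x81 = 1000 0001; rotated left by 1 at width 8 the top bit
  -- wraps around to the bottom, giving 0000 0011 = 3
  print (rotateAt 8 0x81 1)
```

This also makes Lennart's objection concrete: without the width parameter `w` there is nothing for the high bit to wrap into, which is why rotate on a bare Integer degenerates to shift.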
Re: Change Data.Bits.rotate to rotate Integer (unbounded) types
On Sep 19, 2006, at 3:38 PM, Lemmih wrote: On 9/19/06, Peter Tanski [EMAIL PROTECTED] wrote: I don't have a particular implementation in mind but as a general idea it would make the treatment of Integers the same as the treatment of the standard-size bounded ints. A possible implementation might be a stream cipher that uses 128-bit Integers instead of 32-bit ints (bitwise rotations have been used in more than a few stream ciphers). For arithmetic purposes, rotation is also useful for implementing multiplication of finite fields. Ah, so you want to rotate various bounded integers larger than 64bits? You can do that without changing Data.Bits at all (crypto defines Word128, Word192 and Word256 which are instances of Bits). The LargeWord module in Crypto is very cool. Before this email I did not know LargeWord defined rotate (maybe it is the version of Crypto I have--3.03?). -Pete ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: Change Data.Bits.rotate to rotate Integer (unbounded) types
On Sep 19, 2006, at 3:28 PM, Neil Mitchell wrote: Hi, Welcome back! Since Data.Bits is not defined in the Haskell 1998 standard, are we free to change the implementation of Data.Bits? No! If you do things like this, randomly changing the semantics of functions, people will come round to your house with burning pitch forks! Or, if they reside across the Water, they might simply refuse to use my program. The problem with Data.Bits.rotate, rotateL and rotateR for Integers is redundancy: they are the same functions as shift, shiftL and shiftR respectively. The unfortunate (and possibly buggy) consequence for the unwary might be an unexpected change in the operation of a function that uses rotate, rotateL or rotateR over types in class Num when their Int32's have been promoted to Integers to cover overflow. Otherwise it would be much easier to simply leave it as is (for an array of doubles, where bitwise operations are actually performed arithmetically, rotations would be difficult). If you want to have functions with new semantics, it's probably a good idea to give them new names, or do something else to stop changing existing programs. Certainly. The reason I asked about Data.Bits was that it is not defined in the Haskell98 standard--I couldn't add IntegerRotate to the Prelude, and Data.Bits is a library. In any case, I think the general response is that it is (a) a specious idea or (b) already done in other libraries. -Pete
Change Data.Bits.rotate to rotate Integer (unbounded) types
Welcome back! Since Data.Bits is not defined in the Haskell 1998 standard, are we free to change the implementation of Data.Bits? if we are free to change the implementation of Data.Bits, would it be all right to change the operation of rotate, rotateL and rotateR over unbounded types (to my knowledge, currently only Integer)? I would like to change rotate, rotateL and rotateR to actually rotate (not shift) Integers. -Pete ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Help with unregisterised build
Hello Jeremy, I don't know if anyone has gotten back to you on this yet. I have not myself done an unregisterised build of GHC before, but I thought you should at least hear something. The people who know about this all seem to be at the ICFP programme that has been going on since Thursday; they should be back next week. In the meantime, some work has been done recently on the way GHC puts out assembler--not quite the same problem as a multiply-defined-symbol error--but you might consider grabbing darcs and trying to build from the HEAD branch. See http://hackage.haskell.org/trac/ghc/wiki/Building/GettingTheSources. -Pete

Jeremy Wazny wrote: I've attempted to build an installation of GHC which uses unregisterised libraries, but have not had much success. I am new to GHC's build system and would be grateful for some advice. I'm trying to build the 6.4.2 source distribution on an x86 linux machine, using GHC 6.4.2 (the "Generic Linux with glibc2.3" version on the download page.) The target is the same machine.
I've created a mk/build.mk file containing just the line:

GhcUnregisterised = YES

which, according to the comment in mk/config.mk, ought to build the unregisterised libraries that I'm after (and use them by default.) I run configure as follows:

./configure --prefix="$HOME"/ghc_u

and then simply "make". After some time, the build fails with the following:

../../ghc/compiler/ghc-inplace -H16m -O -fglasgow-exts -cpp -Iinclude -"#include" HsBase.h -funbox-strict-fields -ignore-package base -O -Rghc-timing -fgenerics -fgenerics -split-objs -c GHC/Base.lhs -o GHC/Base.o -ohi GHC/Base.hi
/tmp/ghc4237.s: Assembler messages:
/tmp/ghc4237.s:17: Error: symbol `__stg_split_marker' is already defined
/tmp/ghc4237.s:29: Error: symbol `__stg_split_marker' is already defined
/tmp/ghc4237.s:41: Error: symbol `__stg_split_marker' is already defined
/tmp/ghc4237.s:53: Error: symbol `__stg_split_marker' is already defined

This goes on for a while ...

<<ghc: 124912780 bytes, 12 GCs, 808164/1513632 avg/max bytes residency (2 samples), 19M in use, 0.00 INIT (0.00 elapsed), 0.59 MUT (2.39 elapsed), 0.10 GC (0.09 elapsed) :ghc>>
make[2]: *** [GHC/Base.o] Error 1
make[1]: *** [all] Error 1
make[1]: Leaving directory `/mnt/raid/home/jeremyrw/src/src/ghc-6.4.2/libraries'
make: *** [build] Error 1

I've also tried with the following build.mk (I was guessing the -fvia-C might avoid the above assembler problem):

GhcUnregisterised = YES
GhcLibHcOpts = -O -fvia-C

but it fails in the same way. I'm not sure what to do at this point. Am I missing something in the build.mk? Is there likely to be anything else that needs to be tweaked? Has anybody had any success with this sort of thing before?
Re: Replacement for GMP: Update
Hello Thorkil, I am very sorry for the late reply. I have been extremely busy and I wanted to give you a coherent answer. For a brief overview of the speed of the libraries I looked at carefully, see http://hackage.haskell.org/trac/ghc/wiki/ReplacingGMPNotes (I added a few charts to show the speed of ARPREC, OpenSSL's BN, GMP and LibTomMath. I did not add speed tests for Crypto++ and Botan because they don't measure up. The original timings I obtained for them were based on their own benchmarks which are inadequate and (for Crypto++) based on tuned assembly code only available on Pentium4 processors with SSE.) I tested GMP and OpenSSL's BN for reference. Over the past few weeks I tore Crypto++ apart and modified a few things, only to find out that it has a persistent bug: woven throughout the library is a conversion from 32-bit to 64-bit ints using unions. This kind of transformation breaks the C's (and C++'s) aliasing rules (thanks to John Skaller for pointing out the problem), so compiling Crypto++ with optimisations turned on (g++ -O3) introduces failures, especially in the Division algorithms. I could change the unions to bitwise transformations with masks but I would have to really dig out all the references. After some more rigorous timing tests I found that I would have to essentially rewrite the algorithms anyway. What a mess. After some more research I found that there really are no other good public domain or BSD-compatible licensed libraries available. I tested two other free arbitrary precision integer libraries, MPI and MAPM, but they were also too slow, sometimes as much as 4 times slower. MAPM uses a Fast Fourier Transform (FFT) from Takuya Ooura (see http://momonga.t.u-tokyo.ac.jp/~ooura/fft.html) and should have been fast but turned out to be even slower than MPI. 
If you look at the ReplacingGMPNotes page I mentioned at the top of this email, the charts show that LibTomMath is weak in multiplication--at larger integer sizes (2048-4096 bits) it is half as fast as GMP, or worse. On the other hand ARPREC, which also uses a FFT algorithm, is slow at lower precisions (256-512 bits) for two reasons: (1) at relatively low precisions, ARPREC defaults to faster standard algorithms instead of its FFT and (2) when using its fast FFT at medium levels of precision (512 bits) the FFT is too ponderous to keep up with the relatively lighter and faster algorithms of the int-based libraries (GMP, OpenSSL's BN and LibTomMath). (As a little history, ARPREC used to be fairly bad relative to other FFT programs available but was redone after 2003, so by 2004 it was fairly good; it is up to version 2.1.94 now and it is much better. If you are looking at ARPREC benchmarks online prior to 2003, they are too old to be good indicators of its present capability.) I keep talking about ARPREC--why? For three reasons: (1) I trust its level of precision--this has been tested; see FFTW's benchmark page for accuracy: http://www.fftw.org/accuracy/G4-1.06GHz-gcc4/ (2) if you look at the charts, although ARPREC is bad in division it simply blows GMP, OpenSSL and LibTomMath away: at 4096 bits (85 doubles--each double has conservatively only 48.3 or so bits of integer precision), ARPREC can take a full random number to pow(n,7) in 0.98 seconds, compared to 77.61 or so seconds for the leader of the Integer-based libraries, GMP. (I ran the tests many times to make sure readings weren't flukes.) (3) of all the FFT-based arbitrary precision libraries available, ARPREC is the only BSD-licensed one--Takuya Ooura's library (used in MAPM) is only a FFT algorithm and not necessarily either fast or accurate. The rest of the FFT libraries available are essentially GMP-licensed.
So I am in an unenviable position: I intend to fulfill my promise and get a replacement for GHC (and maybe more), but I have to essentially build better functionality into the front-runner, ARPREC. At present I have been working with vector-based algorithms that would enable me to use hardware-tuned code for Single Instruction Multiple Data (SIMD) chipsets. Currently I am researching algorithms based on operating-system supplied vector libraries. Part of this modification involves a fast transformation between a vector of large integers and an array of doubles, without loss of precision (although vectors of doubles are also working well, they do not have the same library support). I am also working on enhancing ARPREC's division algorithm. This is the problem I face: GHC unfortunately does not use Integer as a mathematical object but as a primitive type, complete with bitwise operations. From my experience, GHC users typically use Integers at lower precisions (typically under 256 bits) and they do not use Integer for higher math. (Integer math operations are basic primitives, as you already know.) I
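The integer-to-doubles transformation mentioned above can be illustrated in a few lines: base-2^48 digits each fit exactly in an IEEE double's 53-bit mantissa, so the round trip is lossless. This is only a sketch of the idea for non-negative Integers, not ARPREC's actual storage format, and `toDoubles`/`fromDoubles` are names invented for the example:

```haskell
module Main where

-- Split a non-negative Integer into base-2^48 digits, least
-- significant first; each digit is exactly representable in a Double.
toDoubles :: Integer -> [Double]
toDoubles 0 = []
toDoubles n = fromIntegral (n `mod` b) : toDoubles (n `div` b)
  where b = 2 ^ (48 :: Int)

-- Recombine the digits; round is exact because every digit is an
-- integer-valued Double below 2^48.
fromDoubles :: [Double] -> Integer
fromDoubles = foldr (\d acc -> round d + acc * b) 0
  where b = 2 ^ (48 :: Int)

main :: IO ()
main = do
  let n = 2 ^ (300 :: Int) + 12345
  print (fromDoubles (toDoubles n) == n)  -- round-trips exactly
```

The "size increase" remark in the earlier email shows up here too: a 4096-bit integer needs 86 such digits rather than 64, because each double carries only 48 payload bits instead of a full word.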
Re: Packages in GHC 6.6
Simon, ... At the moment, the only packages you can add in this way are: ALUT, HGL, HUnit, OpenAL, OpenGL, QuickCheck, X11, cgi, fgl, haskell-src, html, mtl, network, parsec, time, xhtml ... instead include smaller and more fundamental packages: ByteString, regexps, Collections, Edisson, Filepath, Time, networking, web. I agree with your original idea and Bulat's: the libraries packaged with a compiler should, as a general rule, be those that support the core features of the language or usage of the language itself. Not far off from that are things like being able to work with the operating system--these are essentially related to I/O--because without them your machinations would only be able to operate on themselves. Without getting philosophical about what is and is not necessary for a language to speak, I'm just looking at the general big-libraries packaged with mature languages like Ada and C++. So here is my vote: Top priority: those you already mentioned, with Bulat's vote for Edison, ByteString, Filepath and Time; I vote for fgl (in my experience fgl is as useful in its own way as Data.Map), mtl, and haskell-src. So the core list might be: *base, haskell98, template-haskell, stm, mtl, haskell-src, *ByteString, Edison, fgl --only because for Edison and fgl there are no reasonable alternatives, so these are in a sense standard *readline, unix, Win32, Filepath and Time (they really complement unix and Win32) *Cabal--though this is an extraordinary convenience, it is not strictly necessary (Haskell-GHC users could always be forced to Include everything outside of a standard system directory. I don't mean a GHC-system directory--there are far too many language-specific directory structures crawling around, it's like trying to hide from Wal-Mart.) ... 
well, that's how I had started the email, but I think your original idea is right: stick with only what you absolutely need and with what is part of the Haskell standard library: *base, haskell98, template-haskell, stm, mtl, haskell-src and Cabal are fine; the rest can go together in a special distribution. Why add mtl and haskell-src? Those seem to be situations where you would have to install a specific library (mtl, say) and specially integrate it into your build of another. At some point the dependencies merit the adoption of a standard. Extra-Haskell parsing is a good example of leaving things out. HaXml's parser (HuttonMeijerWallace, PolyLazy) is roughly interchangeable with Parsec in many ways--though reasonable people may differ on which is actually better--and installing or uninstalling either would be easy. Things that are missing... There are some library systems that really should come standard and therefore need development. The most specific I can think of are deep-core Haskell debugging libraries (would you think of grabbing gcc without gdb?)--a GHC HAT. Things that allow specialised code such as haskell-src-exts's dynamic loading modules would also be good--this is like having dynamic libraries for Haskell. Sorry for the length; I've been literally bugged-out with getting a commercially and GPL-usable comparable replacement for GMP going... Personally I would love to give GHC a built-in interface to a real high-level math library (vectors, arbitrary-precision Reals?) but that would be as bad as forcing all gcc users to uninstall a hypothetical gcc-BLAS before they could install their own system- tuned or more precise library. I have been realising more and more that Integers are not a math-library but a data-type (hence bitwise operations). -Pete ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: Re[4]: Packages in GHC 6.6
Hi Bulat, sorry, but you miscitated me and seems to misinterpret whole idea: Sorry about that. I put the emphasis on your mention of fundamental, almost to the exclusion of compiler-builds. My point was that there are two design considerations when you are designing a compiler system: (1) the compiler itself (which practically everyone agrees should optimally be stand-alone); and (2) the core language libraries. Domain-specific languages like Mozart, research languages or languages with only one compiler, such as Clean, and languages which lack good FFI support tend to be complete language packages. Relatively mature languages such as Fortran, C, C++, Ada and ML (smlnj, MLton) have standard library systems; I think Haskell is moving in that direction. The library systems in Haskell have gone beyond the Haskell98 standard of core functionality with extensions, such as GHC-specific code (and libraries integrated with it), TH, MTL and Arrows. What is standard is more properly a matter for Haskell-prime but may (and has) been implemented into Haskell compiler systems, especially the biggies, GHC and nhc98. As compilers become more plentiful, organisations and even individuals may move away from the Microsoft/Borland/CodeWarrior core distributions and introduce their own separate compiler for a language, as Intel, Sun and IBM have for C, C++ and Fortran; Comeau and Digital Mars for C++; and many others. Haskell hasn't gotten that far yet--once JHC and Yhc are production ready they might fill that position. 2. For windows-like OSes where users prefer to see larger monolithic installations we should include more libraries in standard distribution. I seriously believe the reason for standard distributions on Windows is the extreme difficulty of getting things to build correctly and work together. Once you have reached that beautiful point where almost everything is balanced and relatively stable--Quick! package it before it breaks again!
Package distributions for OS X and mostly-GUI Linux distributions are a convenience; they aren't practically necessary as they are with Windows. Imagine trying to tag GHC distributions to a capricious system like MinGW--which may have older versions of gcc installed. * data structures and algorithms (highest priority) * web/networking * interfacing with OS such as file operations (lowest priority) --web/networking? When I first wrote that email last night I agreed with you that including web and networking tools would be good, as a kind of uniform interface to varying low-level systems libraries, but these are the kinds of libraries that are easily installed separately from a distribution and may be volatile enough to merit separate update cycles. For HTML, XML and cgi-networking in particular there are several stand-alone, stable libraries to choose from. i think that this are the kinds of libraries most widely used. ... and useless graphics libs will allow to cut down GHC installer by about 20%, while selling of basic ghc without these libraries will not by for us any more size reduction I agree. Utility is, however, a very relative term. The utility of core language systems, such as general Haskell-tuned data structures, is not relative, because at some point all programs need them to function. 3. We can also create larger installer which includes libraries and tools that also highly popular but too big to sell them to all who want to download ghc. i mean at first place graphics, databases and RAD tools. concrete, wxHaskell, gtk2hs, db libs, VisualHaskell and EclipseFP Separately maintained, of course; that would give some freedom to Porters :) and last - all the libraries installed except for core ones will be easily upgradable without upgrading ghc... boost their development. That is the hope, I guess. The unfortunate problem with Haskell is rampant, rapid bit-rot: some programs written or maintained into 2003--only three years ago!--are already outdated.
GreenCard is a prime example of this. My point in emphasising a somewhat standard compiler-and-core-libraries setup was to encourage widespread support and maintenance of new-standard libraries and to ensure that by forcing the compiler to build with them, they would not be left behind. -Peter
Re: OpenSSL License (was Replacement for GMP: Update)
John, Have you carefully investigated the OpenSSL license? We in Debian have had repeated problems since the OpenSSL license is, as written, incompatible with the GPL (even linking to OpenSSL is incompatible with the GPL). I would hate to have a situation where all GHC-compiled programs can't be under the GPL. I have been discussing this very issue with several people in this list. At first I thought it was a peripheral issue because I had not carefully reviewed the GPL license until Florian Weimer pointed out what the problem was. By that time I had already decided to work with another replacement library--I have not yet decided which because I am still testing things out--based on good arguments other people gave for not subjecting users to OpenSSL's advertising clause. The libraries I am working with are either public domain or BSD licenses. Sorry to scare you. Best regards, Peter
Re: Replacement for GMP: Update
Florian, This is the offending part:

* 3. All advertising materials mentioning features or use of this software
*    must display the following acknowledgement:
*    This product includes cryptographic software written by
*    Eric Young ([EMAIL PROTECTED])
*    The word 'cryptographic' can be left out if the routines from the library
*    being used are not cryptographic related :-).

It's generally believed that this is a further restriction in the sense of section 6 of the GPL (version 2). In any case, I think it would be more of a restriction to someone *using* the OpenSSL program, not a developer. It's a problem for a developer who wants to use a GPLed library written by someone else, too. Quite right; my mistake: under the OpenSSL license a developer cannot mention features of the software in advertising materials, so the license grant of the GPL-OpenSSL program to the developer is void. The reason I mentioned users only was that in the particular problem we have here GHC does not use any other GPL programs (I think I am correct--readline is the unix version, not the GPL version, correct?) so until the developer compiles a Haskell program with GHC (with OpenSSL) *and* that program uses a GPL program, the Haskell developer is still able to transfer a valid license to users. The way the OpenSSL FAQ stated the problem, the implication was that there was specific mention of OpenSSL in a GPL license. The advertising requirement in the OpenSSL license would certainly constitute a further restriction under GPLv2 section 6; the strange implication is that the no-further-restrictions clause is so broad that the same clause (verbatim) in section 10 of the LGPL means the GPL license is incompatible with the terms of the LGPL! It's all very touchy. Best regards, Peter
Re: Re[2]: Replacement for GMP: Update
Simon PJ and Bulat, [the] ForeignPtr solution [has] gotten a lot cheaper in GHC 6.6 than it used to be, so it's worth trying. A merit of the approach is that it avoids fiddling with the bignum allocator at all. I actually did not know that until today; I have tried to keep up with the rapid changes going on but until Simon Marlow posted the FFI syntax patch on cvs-ghc-request I had not read into it that much. It won't be too much trouble for me to do a bare FFI binding to GMP or another library (people seem to be having problems with OpenSSL's license) but what I have been doing still applies: writing bitwise operators and cleaning things up. I don't know how much the indirection of an FFI binding would degrade the speed compared to a direct C-- binding (you have an extra function call with FFI); it should not be any more costly than indirection through two pointers. This will be quick: I will set up a GMP-FFI binding as a speed-reference, for starters. Best regards, Peter
Re: Replacement for GMP: Update
Brian, Therefore I'd recommend that licenses for code used by GHC runtime should be either BSD or public domain. I agree. I was working on a rewrite of OpenSSL's BN from scratch--maybe a rewrite of GMP would be better--but that is a huge undertaking for no other reason than these are big, complex projects. I will have to test Crypto++ and Botan to see if they are comparable in a Haskell context (both are written in C++; Crypto++ is essentially public domain while Botan has a BSD2 license--reproduce copyright in distributions of binary and source code). I will have to write least-common-multiple, bitwise operators and conversions to and from floating point representations. If the FFI was used for bignum then (talking about Windows OS for the moment) the bignum implementation could just be supplied as a C DLL, perhaps even several different C DLL's for people to choose which one they wanted to distribute with their program based on speed vs licencing issues. Eg if GMP was in a DLL then it would be sufficient to just supply gmp.dll + the gmp LGPL as two files along with the app binary and licensing issues would disappear afaiu. Another advantage of this direction would be that any slowness in the FFI would have to be ironed out, leading to a faster FFI which would be good for other things too eg matrix libs, graphics libs etc. Finally, separating bignum out of GHC runtime would make GHC runtime leaner therefore (hopefully) easier to maintain. I am testing two versions of GMP against the current internal version: one using FFI and another with the ForeignPtrs written in C--. If either is comparable to the internal version that is definitely a preferable solution for flexibility. I have to be very picky, though: Simon Marlow, Simon Peyton-Jones and the rest of the GHC Team are primarily interested in performance and the integrity of the RTS (no one would be happy if the RTS broke for bad FFI calls). Thanks for the encouragement.
Best regards, Peter Tanski
Re: Re[2]: Replacement for GMP: Update
John, After all on the average call where an object of that size is free already it is a single array lookup, we have: (a) fetch pointer (one read) (b) fetch next (one read) (c) store next as current (one write) This is true for memory access; it is not true for memory allocation. I do not know how malloc allocates memory on Windows but on general POSIX systems the kernel uses a linked list and lots of other management things to reduce fragmentation, such as KMEMSTAT. Malloc may also block, which is something that you have more control over in your own garbage collected heap. A really good explanation for the problem of rapidly allocating and deallocating temporary blocks of memory under 35kb is here: http://ridiculousfish.com/blog/archives/2006/05/16/36/ . In any case, Simon Marlow had previously mentioned that alloc (from GHC's heap) is faster than malloc. He is almost certainly correct, although I hope the difference will not be that great and the only thing I have to worry about is ForeignPtr. We shall see whether malloc-memory makes a difference in the benchmarks. A purely functional system -- one which does NOT convert self tail calls into jumps and reuse storage -- can perhaps be faster, since each thread can have its own local arena to allocate from (without need of any write barrier) .. however it isn't clear to me what the trade off is between cache misses and parallelism. That is interesting but I do not understand whether your mention of self tail calls turned into jumps was low or high level. From the context it seems as if you are talking about a high level implementation; each function running in a separate thread. GHC's RTS does use many separate threads (the RTS is threaded by default for the latest version, 6.6). As for turning self tail calls into jumps at the low level, GHC does do this through C-- (the GHC implementation of C-- is called Cmm). I believe that is both faster and more memory efficient than a high level threaded system.
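John's three-step free-list pop, the reason an allocator with size-class free lists beats a general malloc on the average call, can be sketched as follows (a toy model, not GHC's actual block allocator):

```c
#include <assert.h>
#include <stddef.h>

/* Toy free-list allocator for one size class.  Allocation, when a block
 * of the right size is already free, is exactly the three steps quoted:
 * (a) fetch the list head, (b) fetch its next pointer, (c) store next
 * as the new head. */
typedef struct FreeBlock {
    struct FreeBlock *next;
} FreeBlock;

static FreeBlock *free_list = NULL;

static void free_block(FreeBlock *b) {   /* deallocation: push on the list */
    b->next = free_list;
    free_list = b;
}

static FreeBlock *alloc_block(void) {
    FreeBlock *b = free_list;            /* (a) fetch pointer              */
    if (b != NULL)
        free_list = b->next;             /* (b) fetch next, (c) store next */
    return b;                            /* NULL: fall back to the slow path */
}
```

A general-purpose malloc cannot stop here: it must also handle arbitrary sizes, coalescing and fragmentation, which is where the extra cost Peter mentions comes from.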
Philosophically speaking, even if Simon Peyton-Jones developed Cmm to solve the problem of efficient functional computations, Cmm has turned the research of Haskell from research on a computer language to research on a system of computation. (Maybe that is what he meant when some time ago he wrote John Meacham and said that they (the GHC researchers) considered compiling via C a dead end.) A language can be anything: all it requires is a means of being translated into machine code; pseudo-intelligent and advanced compiler systems such as gcc, JHC for Haskell or OCaml for the Caml version of ML may translate programming languages into machine code, but the underlying computation remains largely sequential. The curious thing about GHC-Haskell is that through the prism of Cmm, which enforces such things as immutable variables and recursion right at the machine level, Haskell is less a language of translation to sequential machine code and more a description of a computational model. If you still think I am wrong about this, consider the possibility that Haskell with Cmm is a modern research project driven by the same concept that motivated Lisp: a different model of computation. Best regards, Peter
Re: Replacement for GMP: Update
Einar, In my previous email I wrote something potentially confusing (really a typo): For developers (commercial or open source), the OpenSSL license only mentions redistribution of the OpenSSL code in binary form (paragraph 2). In this context binary form means the complete program binary, not partial binary as with statically linking to a library, so developers of GHC programs would *not* have to include the whole OpenSSL/SSLeay license in their source code. I meant in their code, not source code, source or binary. I hope that helps. Best regards, Peter
Re: Replacement for GMP: Update
Einar, *This product includes software developed by the OpenSSL Project *for use in the OpenSSL Toolkit (http://www.openssl.org/). All developers would have to do is include the acknowledgment stated above. I think this is not bad for specific applications, but forcing this upon all code compiled by GHC would be bad. I think the compiler should not link applications by default to things that force license related things. I think this is one reason GMP is being replaced. ps. personally I don't think the advertising clause is bad, but I think it is bad to force it on other users. You may be right. The licensing problem with GHC, as I understood it, is summed up at http://hackage.haskell.org/trac/ghc/wiki/ReplacingGMPNotes. LGPL is very restrictive. As I have been working on separating BN out of the main OpenSSL distribution, renaming symbols and generally reforming it into a custom, stand-alone library for GHC, I could take it one step further and implement it from scratch as a GHC library. Implementing the BN library from scratch may take some time but I will give it a shot and see if I can't get better benchmarks. The downside is that I would have more incentive to remove some cryptography-based cruft, such as BN_nnmod, BN_mod_add, BN_mod_sub and the BN-random routines, as these are unnecessary for Prelude and GHC. Best regards, Peter
Re: Replacement for GMP: Update
Reilly, ... this shouldn't prohibit linking GMP in dynamically, should it? It's just a C library and GCC should happily generate relocatable code. As a dynamically linked library, there should be no tainting issues to worry about even if the dynamically linked code is shipped with the executable. Am I missing something? Not at all. GHC builds the GMP library only if it is not already available on the system. On my Mac OS X computer it uses a dynamic library. I have not tried using gmp.dll on Windows since I have not built GHC on my Windows computer (it is a bit slow--a 600MHz P3). But the dynamic library form of GMP only solves the licensing problem (admittedly, for my purposes the worst of the bunch). It should be easy to change GMP's build settings so GHC is distributed with a dynamic GMP library. The other problem is that GMP has a mechanism to let the user determine its memory allocator, with the caveat that only one allocator can be used by a single program. GHC configures GMP to use GHC's RTS-GC for allocation, so GHC-compiled programs can't use GMP separately. (This would not be such a big problem for general programs but C-Haskell cryptographic or scientific programs that might benefit from GMP's additional functionality would suffer.) On a side note, if you have been reading this user-list recently it seems that programmers (including myself, I guess) do not want to have to package a dynamic library (GMP) with programs compiled with GHC--a particularly irksome task if your Haskell program doesn't even *use* Integer. Not only do users have to package the separate dll, they also have to package a notice of the GMP copyright along with the binary. Just today Einar Karttunen mentioned that: *This product includes software developed by the OpenSSL Project *for use in the OpenSSL Toolkit (http://www.openssl.org/). All developers would have to do is include the acknowledgment stated above. ... ps.
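The one-allocator-per-program caveat can be illustrated with a toy model. This is not GMP's actual API (GMP's real hook is mp_set_memory_functions; the names below are made up); the point is only that a single process-wide hook means whichever client installs its allocator last wins for everyone:

```c
#include <assert.h>
#include <stdlib.h>

/* Toy model of a library with one global allocation hook.  If the GHC
 * RTS installs its GC-heap allocator and a separate C client later
 * installs its own, the client's silently takes over for both. */
typedef void *(*alloc_fn)(size_t);

static alloc_fn lib_alloc = malloc;   /* the single process-wide hook */

static void lib_set_alloc(alloc_fn f) { lib_alloc = f; }

static int ghc_calls, c_calls;        /* instrumentation for the sketch */
static void *ghc_heap_alloc(size_t n) { ghc_calls++; return malloc(n); }
static void *c_client_alloc(size_t n) { c_calls++;   return malloc(n); }
```

With GMP the situation is worse than in this sketch, because the GHC-installed allocator hands out GC-managed memory that the C client cannot safely hold across a collection.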
personally I don't think the advertising clause is bad, but I think it is bad to force it on other users. Einar does have a good point, here. Personally speaking, such packaging and licensing stuff is o.k. for free software but for a clean commercial distribution it would be a bad thing; a reason to choose not to use GHC (or nhc98, for that matter). Best regards, Peter
Replacement for GMP: Update
Simon PJ, Simon, Esa and John, Here is an update on what I have been doing so far in making a grand attempt to replace GMP.

(1) evaluate replacement libraries

LibTomMath:
Pros-
* has all the operators GMP offered
Cons-
* slow; about half as fast as OpenSSL in my tests for simple mathematical operations, much slower if you account for time to write or resize memory. (There is another MP library, which LibTomMath is based on, that is also very slow--student work)
* not even reentrant; needs significant modification
* beta-release; needs a bit of work to get it to production level
* configuration needs to be written (not really portable, messy)

ARPREC:
Pros-
* very fast (essentially does everything through the Floating Point Unit of a CPU)
* complex mathematical operations
* very precise
* already thread safe (through C++ thread-safe statics)
Cons-
* no bitwise operations (not even left and right-shifts)
* difficult configuration (everything runs by setting a precision level (precision level ~= number of words (doubles) in array); it does not automatically resize memory; conversion from MP Real to Integer relies specially on careful precision-level)
* memory inefficient (underestimates the number of real digits you can fit into a double, i.e., a 64-bit double has 48 bits of precision, holding about 9.6 digits per byte, resulting in an 848-byte array to hold an MP number with 1000 digits).
OpenSSL, Crypto++ (http://www.eskimo.com/~weidai/cryptlib.html) and Botan (http://botan.randombit.net/):
Pros-
* all of these are fast, since all use Integers to support cryptography (Crypto++ and Botan are C++ crypto-libraries; all licenses good)
* all of these provide most basic mathematical operations
Cons-
* none of these provide &, |, ^ (xor) bitwise operators
* Botan has the fewest mathematical operators of the three
* none provide an lcm operator
* all would realistically have to have the Integer libraries stripped out of the distribution and repackaged for GHC

Summary: I finally settled on modifying OpenSSL, since that would be the easiest to work with under GHC's hood (plain C code, not C++). I have to:
a. make OpenSSL's BN thread-safe (add thread-local storage, at least)
b. optimally add configure-conditional parallelism to BN (through PVM)
c. add &, |, ^, lcm and a few other operations
d. separate the BN from the rest of OpenSSL and rename the symbols to avoid conflicts (necessary because I have to modify the library anyway)

(2) work on GHC:
* finally understand C--; know what I need to modify
* work through Makefiles: touch and go; I haven't mapped out all the variable settings from configure.in on down when it comes to DLLs

Comment: for the Makefile in ghc/rts, in lines 300-346, GC_HC_OPTS += -optc-O3 --isn't this problematic? gcc, from -O2 on, includes -fgcse, which may *reduce* runtime performance in programs using computed gotos; -fgcse is actually run twice, because -frerun-cse-after-loop is also set at -O2. Would it be better to pass individual flags, such as -funroll-loops and -falign-loops=16 (ppc, Intel setting)?

(3) I have been looking at how to implement a dual-constructor-in-a-pointer for Integer (i.e., merge constructors of small Integers and big Integers into the Int#). Would that solution be workable or might it break current Haskell programs? Just a thought.
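The dual-constructor idea in (3) can be sketched with ordinary pointer tagging (a hypothetical illustration only; GHC's eventual representation may differ): heap pointers are word-aligned, so the low bit is free to distinguish a small Integer stored directly in the word from a pointer to a big-Integer record.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tagged-word representation: low bit 1 means the rest of
 * the word is an immediate small integer; low bit 0 means the word is
 * a pointer to an out-of-line bignum. */
#define TAG_SMALL 1

static uintptr_t pack_small(intptr_t n)    { return ((uintptr_t)n << 1) | TAG_SMALL; }
static int       is_small(uintptr_t w)     { return (w & TAG_SMALL) != 0; }
static intptr_t  unpack_small(uintptr_t w) { return (intptr_t)w >> 1; }   /* arithmetic shift */
static uintptr_t pack_big(void *p)         { return (uintptr_t)p; /* aligned: low bit 0 */ }
static void     *unpack_big(uintptr_t w)   { return (void *)w; }
```

The cost is one bit of range for small values and a tag check on every operation; the saving is that the common small-Integer case never touches the bignum library or allocates at all.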
-Peter
Re: Re[2]: Replacement for GMP
Hi Bulat, don't forget about speed/memory efficiency of any programs that use Integer just for case but really most of their numbers fit in 32/64 bits. i have one particular program of this type - it builds list of all files on disk and Integers are used to save filesizes. i will be glad if, vice versa, memory requirements for small integers will be reduced to the same as for Ints I was looking at this yesterday, partly because I read your previous discussion in returning to cost of Integer. The low-level solution Lennart mentioned, and that you noted is used in OCaml, would be fast and convenient for a programmer. That solution would be much more difficult in C--, however, since it requires customisations for different processors or operating systems, especially 32- and 64-bit architectures. Following the general trend of consensus, it should be part of the Bignum library; if the library does not have such a test, as OpenSSL's BN library does not, it would have to be added. (With unmodified OpenSSL, you would have to examine the When the Bignum library returned the value to the RTS, the RTS would only have to check for the tag and store it accordingly. the same binary that also wants to use GMP. (Of course, we could *copy* GMP, changing all the function names. That would eliminate the problem!) isn't it rather easy task for some automated tool? i think that even existing tools may be found I know copyrights are weak compared to patents but I do not think software copyrights are that weak. Just changing the names seems like a cosmetic change, and performing the change through an automated system, doubly so. Copyrighted programs--particularly under the GPL license, which also covers the resulting object code--do not lose their copyright protection through name-mangling performed by a preprocessor. I think the lawyers for a company using GHC would probably be worried about it.
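The small-number test the Bignum library would need could look something like this sketch (hypothetical names and layout; assuming little-endian 32-bit limbs with leading zero limbs already stripped, which is not necessarily how OpenSSL's BN stores numbers):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical library-side test: does this multi-limb number fit in a
 * 64-bit machine word?  If so, the RTS can store it as a tagged small
 * Integer instead of keeping the heap-allocated limb array. */
static int bn_fits_u64(const uint32_t *limbs, size_t used) {
    (void)limbs;            /* leading zero limbs assumed stripped */
    return used <= 2;       /* two 32-bit limbs fit a 64-bit word  */
}

static uint64_t bn_to_u64(const uint32_t *limbs, size_t used) {
    uint64_t v = 0;
    while (used--)
        v = (v << 32) | limbs[used];   /* fold limbs, most significant first */
    return v;
}
```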
Best Regards, Peter
Re: Replacement for GMP
Simon, (1) We'd be delighted to use a BSD-licensed alternative to GMP in GHC. It's been a long-standing issue, just never quite important enough to get done. If either or both of you are willing to put in the legwork, and emerge with an implementation that we understand and can maintain, we'd be happy to use it. We'll certainly help in any way we can. I shouldn't speak for Esa, who kindly offered advice if I run into trouble. More than a few people seem interested in this, so maybe a bunch of us can carry it off. (2) We're concerned about performance. Replacing GMP, but losing substantial performance on bignum-intensive programs would be unattractive. Definitely. You had mentioned OpenSSL as a possible replacement (at least it contains all the currently implemented Prelude functions over Integer except for lcm and integral powers). I had mentioned ARPREC and Esa had cautioned against attempting to integrate with a C++ library. LibTomMath covers everything and Esa had worked on that before. Would you be able to suggest another possibility? (3) It's unlikely (albeit not impossible) that we'll get GMP-level performance out of a Haskell-only bignum library. ... this would be a step backwards! It certainly would. As noted below I have been searching for high performance options because a Bignum library for a builtin type such as Integer should optimally perform like an embedded system. (4) The tricky spot for any library is memory allocation. Our GMP-based implementation works by getting GMP to use GHC's allocator to allocate memory. This means that every bignum is allocated in the Haskell heap, is automatically managed by GHC's garbage collector, which is Very Good. How would lib-based memory allocation affect concurrency and parallelism? I mentioned to Esa that the library should be threaded, not merely thread-safe or reentrant. (OpenSSL is reentrant, through the CTX system, I believe.)
But because the allocator is statically linked to GMP, you can only have one allocator, and that leads to difficulties if you have another bit of the same binary that also wants to use GMP. That seems to be the second main reason to replace GMP. I could modify the Bignum library to be multi-threaded, with a separate thread system tied to RTS-memory. Would that be workable? (Of course, we could *copy* GMP, changing all the function names. That would eliminate the problem!) In an email I sent to Bulat Ziganshin, I noted that such a fix would be legally worrisome: it seems like a cosmetic change, like changing the titles of chapters in a book. At the least I think it would worry the lawyers of any company using GHC to produce commercial products. I suppose that one alternative is to let the library use 'malloc', but make a foreign-pointer proxy for every bignum, which calls 'free' when the GHC garbage collector frees it. Not as efficient, though. Esa and I had discussed the possibility of copying the value returned from the Bignum lib into the GHC system, which certainly would not be very memory efficient, but might be faster. Among other memory modifications, it might be a good idea to initialise the Bignum lib with the RTS and modify the lib with a memory cache or garbage collection system of its own. (5) If you do go ahead, could you pls start a Wiki page on the GHC development Wiki (http://hackage.haskell.org/trac/ghc), where you document your thoughts, the evolving design etc? You might want to extract the core of this email thread to initialise the Wiki page. I got the page started on a document here already. I will have the rest up very soon. On a related note, I am working through the Makefiles for the RTS; I could clean things up a bit while I am at it. One of the big goals seemed to be to move to Cabal--or was that just for the Haskell libraries? 
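Simon's malloc-plus-foreign-pointer-proxy alternative can be sketched with a toy model (hypothetical; this is not GHC's ForeignPtr implementation): the bignum payload lives in malloc'd memory, and the GC-managed proxy carries a finalizer that the collector runs when the proxy itself dies.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy foreign-pointer proxy: pairs a malloc'd payload with the free
 * routine the garbage collector should invoke when the proxy becomes
 * unreachable. */
typedef struct Proxy {
    void *payload;                /* malloc'd bignum limbs             */
    void (*finalizer)(void *);    /* run when the GC frees the proxy   */
} Proxy;

static int frees;                                  /* instrumentation  */
static void counting_free(void *p) { frees++; free(p); }

static Proxy mk_proxy(void *payload) {
    Proxy pr = { payload, counting_free };
    return pr;
}

static void gc_collect(Proxy *pr) {   /* what the collector would eventually do */
    pr->finalizer(pr->payload);
    pr->payload = NULL;
}
```

The inefficiency Simon notes shows up here too: every bignum costs a malloc/free pair plus the proxy object, where the GC-heap scheme pays only a bump allocation.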
I had mentioned this to Esa: would you be interested in moving to a higher level configuration system for heterogeneous program parts, such as Bakefile or SCons? Bakefile, at least, would result in the current make-based install but would be more easily maintainable and would allow variations for different compilers. Best regards, Peter
Re: Re[4]: Replacement for GMP
Hi Bulat, the same binary that also wants to use GMP. (Of course, we could *copy* GMP, changing all the function names. That would eliminate the problem!) isn't it rather easy task for some automated tool? i think that even existing tools may be found I know copyrights are weak compared to patents but I do not think i proposed this is as solution for technical problem (inability to use GMP in ghc-compiled programs due to name reuse and inability to use ghc-specific GMP in user code), not as the way to avoid copyright problems :) Ahem... my apologies for taking it out of context. I took your comment (and Simon's) as copyright-related because I thought part of the problem with a single binary where foreign code used GHC's GMP was due to the integration of GMP's memory with the GHC RTS. Indeed it shouldn't be too difficult to change the names of the functions in GMP. -Peter
Replacement for GMP
Esa, In the July thread, (Repost) Replacement for GMP as Bignum: ARPREC? Haskell?; OS X and OpenSSL, you wrote: In past, I tried to get rid of GMP by replacing it with libtommath http://math.libtomcrypt.com/ But I have given up for now - because of related and unrelated problems. Since I had no prior experience with LibTomMath I decided to take a look at it. The most recent release version of LibTomMath seems to be 0.39. Were some of the related problems you ran into due to bugs in an earlier version of LibTomMath? Maybe it is premature for me to go looking at replacing Integer when GHC has already moved so far ahead of itself I have to service the source code just to get it to build on OS X, but I figure that while I have the hood up I might as well take a look at the rest of the engine... If you have any old notes on the problems you encountered, I would greatly appreciate it. -Pete Tanski
Re: Replacement for GMP
Esa, What I have written here might not be the most useful guide to start with, but maybe it is of help for other interested souls. Many thanks for the notes; it would probably be better if more than one programmer worked on it. * The memory handling: The idea in most bignum libs is that they have a C structure something like this:

struct Big {
    word    size, used;
    digits* payload;
    bool    sign;
};

Now, size and used tell how much memory is allocated for payload and how much of it is used; sign is the sign (plus/minus); payload is a pointer to the memory that contains the decoded Integer. ... Before we ... call the math lib, we put together a temporary structure with the correct pointers. As for the target variable, we have hooked the mathlib's memory allocation functions to allocate correctly. Upon returning an Integer, we just take payload, write the sign in the correct place and return the payload pointer (possibly adjusted). In pseudo C:

digits* add(digits* din) {
    Big in, out;
    in.size    = getLength(din);
    in.used    = getLength(din);
    in.payload = din;
    in.sign    = getSign(din);
    math_lib_init(&out);
    math_lib_add(&out, &in);
    writeSign(out.payload, out.sign);
    return out.payload;
}

Sorry to take more of your time, but what do you mean by "allocate correctly"? (This may sound naive): the in { size, used, payload, sign } are all parts of the info-table for the payload, and the RTS re-initialises the mathlib on each invocation, right? In the thread "returning to cost of Integer", John Meacham wrote: we could use the standard GMP that comes with the system since ForeignPtr will take care of GCing Integers itself. From the current discussion: at present the allocation is done on the GC heap, when it could be done entirely by the mathlib. The benefit of letting the mathlib handle memory would be that you could use the mathlib with another (non-Haskell) part of your program at the same time (see, e.g., (bug) Ticket #311). (I am making an educated guess, here.)
You probably chose to allocate GMP's memory on the GC heap because: (1) call-outs to another program are inherently impure, since the type system and execution order are not defined by the Haskell runtime; and, (2) it was a stable way to ensure that the allocated memory would remain available to the thunk for lazy evaluation, i.e., so that the evaluation of the returned Bignum could be postponed indefinitely, correct? Or could the evaluation itself be postponed until the value was called for--making operations on Integers and other Bignums lazy? In other words, it does not seem possible to simply hold a ForeignPtr to the returned value unless there were a way to release the memory when it was no longer needed. If you wanted the mathlib to retain the value on behalf of GHC, you would have to modify the library itself. In the end you would have a specialised version of the library and a change to the procedure from: math_lib_init ; ... return out.payload ; to: math_lib_init ; math_lib_evaluate ; math_lib_free ; An easier though less-efficient alternative would be to have GHC copy the value returned by the mathlib. That would be stable and would allow other systems to use the same mathlib concurrently (assuming the lib is thread-safe). The third alternative I suggested previously was to embed the Bignum processing in GHC itself. I think it would be very difficult to maintain a solution that was both optimised and portable, at least in C--. (I may be way off here; I am simply going by a rudimentary knowledge of BLAST implementations.) If I am correct about (2) above, the best conclusion I can draw from this is that the easiest solution would be to copy the memory on return from the mathlib. There are tricky parts for 64-bit stuff on 32-bit systems, and some floating point decoding uses bad configure-stuff that depends on math lib internals, but mostly it's very boring hundreds of lines of C-- (though you can make the job much easier by using the preprocessor).
I was reading through the Makefiles and headers in GHC's main include directory about this. One of the big ToDo's seems to be to correct the method of configuring this stuff using machdep.h or the equivalent on a local system, such as the sysctl headers on Darwin. For C-- this seems like it would be a bit more difficult than simply confirming whether (or how) the C implementation conforms to the current standard through the usual header system. This is why ARPREC might be hard - you need to know the internal representation. GHC's C-- unfortunately is not really near the C-- spec: first of all, it doesn't implement it all - but that doesn't matter for this task - and then it has some extensions for casting and structure reading, I think. These are really great suggestions. GHC's code (including the .cmm files) seems...
Re: Replacement for GMP
Hey Esa, Another great, instructive email! Thanks again! I will keep this response short because I am sure you are busy and you have been more than helpful so far. I also need to get back to working through the code... I hope my answer helps, but if it gets you more confused, maybe it's just because I'm confused... No, you are just trying to understand what I am saying, and since I am new to GHC's RTS internals I do not yet have the knowledge to express my thoughts well. There are other nicer things about that as well - untying Integer (at least mostly) from the runtime/frontend and moving it more into the domain of libraries. [snip] Another program? I assume you meant outside pure Haskell - call-outs have side-effects. We can get around that by using unsafePerformIO, which doesn't really differ that much from writing it in C-- (and the library in C), except we'd write Haskell. If the program (written in C--, C, C++, whatever) and the interface from Haskell to that program were well-typed, it would not be any different from writing the entire program in Haskell, but the order of execution must remain in sync with the Haskell program. If the RTS is threaded or parallel, you might imagine problems cropping up. In this case I have to evaluate whether, say, OpenSSL's BN library is threaded (or, for Parallel Haskell, also working through PVM), not merely thread-safe and certainly not merely reentrant. Uhm, naturally, the Haskell RTS needs to control the lifetime of the memory. I am not sure what you're trying to say here, really. Is the point that we cannot free almost anything without permission from the garbage collector? Because, yeah, we can't. My problem was not with the garbage collector but with what the garbage collector depends on: when an object is evaluated and no longer in scope.
I am not insane enough to attempt writing high-level mathematical operations such as pow() or sqrt() as primitives in an integrated Bignum implementation, but you might be able to imagine that at that level it would be possible to choose when to save and when to evaluate parts of a long equation. As I understand it, you suggest here copying the payload instead of merging memory handling. I don't think it's clearly, if ever, less efficient than the ForeignPtr-based approach. But I'd guess it is *more* code than the current solution. Excellent point. The third alternative I suggested previously was to embed the Bignum processing in GHC itself. I think it would be very difficult to maintain a solution that was both optimised and portable, at least in C--. (I may be way off here; I am simply going by a rudimentary knowledge of BLAST implementations.) I don't think it differs much from doing the same in C. It does seem a shame to write a bignum in C--, as we don't get many elegance-style advantages from writing it in C-- instead of C. [cut and paste from below] As for what it has to do with the topic at hand, I have no idea. C-- is simply used as an intermediate language by the compiler; for convenience of calling conventions and such, some low-level operations are written in it. When I mentioned BLAS (not BLAST, sorry) implementations, I meant that - as I understand it - some may contain hand-optimised code written in assembler. Certainly I would personally prefer implementing something in C, but as you noted, C-- allows GHC more convenience. C-- may also allow GHC to manipulate fragments and produce native code in ways that may not be possible to express in C, including the Bignum implementation. I don't know whether GHC takes -fasm to this extent, that is, further than patching object code from a Bignum library, but I think that is one of the long-term goals. ...
One of the big ToDo's seems to be to correct the method of configuring this stuff using machdep.h or the equivalent on a local system, such as the sysctl headers on Darwin. For C-- this seems like it would be a bit more difficult than simply confirming whether (or how) the C implementation conforms to the current standard through the usual header system. Neither C nor C-- was meant to be used to detect what a system can do. It is simply a byproduct, which autotools takes to the extreme. I meant that the autotools determine the correct configuration--the big ToDo--and that the C or C-- code must be written to conform to the configuration that the autotools found. C would certainly be the easiest; C-- would mean reaching deep into the specs. C would also open the possibility of optimisations from compilers and system libraries that would not be available to C--. One more reason to use a separate Bignum library... Best regards, Peter
Re: GHC runtime DLLs
Brian, Sorry, I smash out emails without thinking and forgot GHC is distributed with static archives on Windows. No more. Even if you build the GHC runtime library from source as DLLs you will run into another problem, as noted in the DLL-NOTES file (see http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/DLL-NOTES?rev=1.1;content-type=text%2Fplain or the file compiler/DLL-NOTES in the distribution): some of the runtime system must be statically linked into your main executable in order to resolve all the symbols at compile-time. The only way around this is to treat those runtime libraries like plugins. A good reference is http://edll.m2osw.com/. I have honestly not used EDLL on the GHC runtime libraries, but it should be possible. One of the main goals for the next major release of GHC is to make it Windows-native and use Microsoft's CL. I think that is another big project. The disadvantage of free software is that it often feels like you are trying to build a car with spare parts; either you spend your time porting and fixing things yourself--an almost daily task, these days--or you wait for someone with more experience or time than you have to fix it for you (which may never happen, or may not happen the way you want it). The advantage of free software is that, like the Haskell language, you get to use some of the most advanced programming available. So here I am, trying to figure out what I can do to help GHC, since right now GHC is the only actively maintained, current Haskell compiler available. (In any case, nhc98 uses GMP as well, so even if you use nhc98 you will still have the DLL-NOTES problem to deal with.) Best Regards, Peter On Jul 30, 2006, at 12:33 PM, Brian Hulley wrote: [EMAIL PROTECTED] wrote: Brian, The standard method of skirting the LGPL restriction and saving your source code is to link dynamically in a separate step and then distribute your program along with the dynamically linked LGPL'd library.
Compile with ghc -c (or with ghc -c -odir 'separate directory where you want to store the object files') and pass specific options to the linker through gcc with -optl. Then link the object files for your program separately using ld, and distribute the GHC runtime libraries you need to dynamically link along with your program. Some of these runtime libraries are big, but on average libHSrts_dyn, libHSbase_dyn and libHSbase_cbits_dyn do the trick (I have needed cbits for programs that use -ffi). Hi - I think the main problem here is that I'm using Windows, so there is no way to dynamically link with the runtime libraries - the GHC implementations available for Windows only produce statically linked executables. Perhaps Windows support was just an afterthought to the main development of GHC on Unix, but I think it's quite a serious nuisance that the GHC runtime incorporates LGPL'd components, in the light of the absence of the facility to dynamically link with it on this platform. Regards, Brian. -- Logic empowers us and Love gives us purpose. Yet still phantoms restless for eras long past, congealed in the present in unthought forms, strive mightily unseen to destroy us. http://www.metamilk.com