Re: GHC, Clang XCode 4.2
On Oct 12, 2011, at 5:26 AM, Simon Marlow wrote:

On 11/10/2011 18:45, David Peixotto wrote:

Ok, I have attached a set of patches to support building the GHC runtime with llvm-gcc. The patches are based on commit 29a97fded4010bd01aa0a17945c84258e285d421, which was last Friday's HEAD. These patches are also available from my github repository on the llvm-gcc branch at git://github.com/dmpots/ghc.git

There are three patches:

0001 - Uses pthread_getspecific and pthread_setspecific to access gct when the `llvm_CC_FLAVOR` CPP macro is set
0002 - Modifies the configure scripts to set the `llvm_CC_FLAVOR` macro when compiling with an llvm-based compiler (either llvm-gcc or clang)
0003 - Passes the gct variable as a parameter in the GC. This change is parameterized with CPP macros so that it only takes effect when compiling with an llvm-based compiler.

Patches 0001 and 0002 provide the minimal support needed to build GHC with llvm-gcc. The 0003 patch is there to limit the performance hit we take by going through pthread functions to access gct. I think the 0001 and 0002 patches should not be very controversial, but the 0003 patch is a more invasive change and perhaps Simon Marlow will want to clean it up before it is applied.

Thanks, I'll take a look at these soon. Just a thought, but someone might want to write a blog post about how Apple's choice to move to llvm-gcc is imposing a performance penalty on us here, and get it up on Reddit. That would give the issue some publicity (they love that sort of thing on Reddit), and might result in some action. I'd be happy to proofread a blog post before publication. Some simple benchmarks would be needed - one option is to use GHC itself with the various combinations of compilers + RTS changes, or there is a small set of GC benchmarks in nofib/gc.

Since I have the different versions already compiled, I can do the benchmarking. I'll probably use GHC and fibon as the benchmarks, since I know how to configure those best.
It shouldn't be too much work to write a discussion of the problems we are having, so I'd be willing to write the post too. It will take a few days before I can put it all together, but I'll send a draft out when I have something passable.

Cheers, Simon

I ran a validate with the patches, and found one additional failure when going through an llvm-based compiler. There were no additional failures when using a gcc compiler, even with my patches applied. The additional failure is the cgrun071 test, which tests the popCnt primitives. I'm going to look into why that test fails, but I think the patches should be safe to apply, as the failure would only show up when compiling with llvm-gcc, which is currently impossible without these patches. -David

On Oct 7, 2011, at 10:30 AM, David Peixotto wrote:

On Oct 6, 2011, at 7:32 AM, Simon Marlow wrote:

On 05/10/2011 09:46, austin seipp wrote:

There has been recent discussion on the Homebrew bug tracker concerning the upcoming XCode 4.2 release by Apple, which has apparently just gone GM (meaning they're going to make a real release on the app store Real Soon Now.) The primary concern is that XCode will no longer ship GCC 4.2 at all, it seems. XCode 4.0/4.1 merely set 'llvm-gcc' as the default compiler, and GHC's `configure` script was adjusted to find the `gcc-4.2` binary. If you have old XCodes installed, then you may have the binaries lying around, but I doubt they'll be on your $PATH, and anybody doing a fresh install is SOL. It seems Clang 3.0 will now be the default compiler, with llvm-gcc as a deprecated option, probably removed in XCode 4.3. It doesn't really matter, because neither of them works with GHC, due to its use of either A) global register variables of any kind, or B) the __thread storage modifier. David Peixotto did some work on this not too long ago when the issue of compiling with Clang was raised.
His branches include changes which make the 'gct' variable use pthread_getspecific rather than __thread for TLS, and then as an optimization use inline ASM to grab the value out of the variable, with an impact of about 9% it seems - but that's on nofib and I don't know if it was -threaded. He also included a version which passes 'gct' around as a parameter to all GC functions, which is a bit uglier but may give better performance, I guess. (The discussion is from here, IIRC.) I suppose the real perf killer here is probably -threaded code. https://github.com/dmpots/ghc

Was there ever any decision on which route to take for this issue? The parameter-passing solution looks quite a bit uglier IMO, but it may be necessary for performance. I'm happy to incorporate the parameter-passing changes if necessary. I think it should only be important in the inner loop of the GC.
Re: GHC, Clang XCode 4.2
On Oct 6, 2011, at 7:32 AM, Simon Marlow wrote:

On 05/10/2011 09:46, austin seipp wrote:

There has been recent discussion on the Homebrew bug tracker concerning the upcoming XCode 4.2 release by Apple, which has apparently just gone GM (meaning they're going to make a real release on the app store Real Soon Now.) The primary concern is that XCode will no longer ship GCC 4.2 at all, it seems. XCode 4.0/4.1 merely set 'llvm-gcc' as the default compiler, and GHC's `configure` script was adjusted to find the `gcc-4.2` binary. If you have old XCodes installed, then you may have the binaries lying around, but I doubt they'll be on your $PATH, and anybody doing a fresh install is SOL. It seems Clang 3.0 will now be the default compiler, with llvm-gcc as a deprecated option, probably removed in XCode 4.3. It doesn't really matter, because neither of them works with GHC, due to its use of either A) global register variables of any kind, or B) the __thread storage modifier.

David Peixotto did some work on this not too long ago when the issue of compiling with Clang was raised. His branches include changes which make the 'gct' variable use pthread_getspecific rather than __thread for TLS, and then as an optimization use inline ASM to grab the value out of the variable, with an impact of about 9% it seems - but that's on nofib and I don't know if it was -threaded. He also included a version which passes 'gct' around as a parameter to all GC functions, which is a bit uglier but may give better performance, I guess. (The discussion is from here, IIRC.) I suppose the real perf killer here is probably -threaded code. https://github.com/dmpots/ghc

Was there ever any decision on which route to take for this issue? The parameter-passing solution looks quite a bit uglier IMO, but it may be necessary for performance. I'm happy to incorporate the parameter-passing changes if necessary.
I think it should only be important in the inner loop of the GC (scavenge_block/evacuate and the functions called from there). If someone sends me a working patch, I can clean it up as much as possible.

I will take a look at getting the patches to work with GHC HEAD. To support llvm-gcc we need two basic things:

1. autoconf support to detect when we are compiling with llvm-gcc
2. a work-around for TLS in the GC

My old patches take care of both 1 and 2. For #2 the easy approach is using pthread_getspecific, which is a pretty small change. Passing gct as a parameter is a much uglier, more invasive change, but I had it working at some point. It could perhaps be made less ugly by only passing it around in the important functions.

Supporting clang is going to require more changes. The last time I tried, there were problems with the preprocessor that made it not work. I just tried again yesterday and ran into a problem generating makefile dependencies, because apparently clang does not allow both the -MM and -MF flags (http://llvm.org/bugs/show_bug.cgi?id=8312).

Note that some of the overhead measured by David is coming from using the LLVM backend to gcc instead of gcc's own backend. In my own measurements a few months ago I found LLVM generated slower (but smaller) code for the GC. Maybe this will change over time. It's also possible that the current GC code is tuned for gcc - at various times in the past I've gone through and tweaked the code to get good assembly out (much as we tweak Haskell to get good Core :-).

Cheers, Simon

I'm just posting this here as a reminder, as this is probably going to become a problem pretty quickly for anybody who uses Lion or modern XCode and also likes using GHC, so it should probably be sorted out. :) I'm still on SL using XCode 4 so it's not an issue for me, but it will be for any future Mac endeavors.
Hopefully they get support for __thread or something equivalent soon, because nobody likes performance hits, but it doesn't seem like we have a choice. ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: GHC 7.0.4 on Lion
I think the warnings are not a big concern. I silence both of them by adding -optl-Wl,-no_compact_unwind,-no_pie to my ghc options in /usr/bin/ghc.

In 10.7 they changed the default linking options to create a PIE (position-independent executable). To create a PIE you have to compile all code as position-independent, which is the default for GHC on Mac OS X. For performance reasons some code is compiled with absolute references (like the gmp library code in your example), so it cannot be used when creating a PIE. The advantage of a PIE is that it is more secure, because the OS can load it at a random base address.

I believe the compact unwind warning is related to the creation of unwind frames for error handling with exceptions in languages like C++. There are some more details in this trac ticket: http://hackage.haskell.org/trac/ghc/ticket/5019. I'm not sure what the advantage of the compact unwind format is, but it sounds like it could make the executable smaller. -David

On Jul 25, 2011, at 7:59 AM, Luca Ciciriello wrote:

Hi All. I've installed on my Mac the new MacOS X 10.7 (Lion) with Xcode 4.1. Using ghc 7.0.4 (64-bit) on that system I get the following warnings in the linking phase:

Linking hslint ...
ld: warning: could not create compact unwind for _ffi_call_unix64: does not use RBP or RSP based frame
ld: warning: PIE disabled. Absolute addressing (perhaps -mdynamic-no-pic) not allowed in code signed PIE, but used in ___gmpn_modexact_1c_odd from /Library/Frameworks/GHC.framework/Versions/7.0.4-x86_64/usr/lib/ghc-7.0.4/integer-gmp-0.2.0.3/libHSinteger-gmp-0.2.0.3.a(mode1o.o). To fix this warning, don't compile with -mdynamic-no-pic or link with -Wl,-no_pie

Is this something to worry about? Thanks in advance for any answer.
Luca
Re: HEAD doesn't build on OS X
I've attached a small patch that seems to fix the build. The problem was that the `dtraceSparkCounters` function was getting called even when we were not compiling for the threaded RTS: the call had been placed outside the #ifdef for the THREADED_RTS symbol, which caused a compile error since the spark_stats field of a Capability is only available when compiling for the threaded RTS. The patch just moves the call inside that #ifdef.

0001-Only-call-dtraceSparkCounters-in-threaded-rts.patch Description: Binary data

On Jul 20, 2011, at 1:29 PM, Johan Tibell wrote:

Still broken, but now for some other reason:

inplace/bin/ghc-stage1 -optc-Wall -optc-Wextra -optc-Wstrict-prototypes -optc-Wmissing-prototypes -optc-Wmissing-declarations -optc-Winline -optc-Waggregate-return -optc-Wpointer-arith -optc-Wmissing-noreturn -optc-Wnested-externs -optc-Wredundant-decls -optc-Iincludes -optc-Irts -optc-DCOMPILING_RTS -optc-fno-strict-aliasing -optc-fno-common -optc-Ilibffi/build/include -optc-DDTRACE -optc-fomit-frame-pointer -optc-DRtsWay=\rts_v\ -H64m -O -fasm -Iincludes -Irts -DCOMPILING_RTS -package-name rts -dcmm-lint -Ilibffi/build/include -DDTRACE -i -irts -irts/dist/build -irts/dist/build/autogen -Irts/dist/build -Irts/dist/build/autogen -optc-O2 -c rts/ClosureFlags.c -o rts/dist/build/ClosureFlags.o

In file included from rts/Schedule.h:15, from rts/Capability.c:23:0:
rts/Trace.h: In function ‘traceSparkCounters’:
rts/Trace.h:516:0: error: ‘Capability’ has no member named ‘spark_stats’
rts/Trace.h:516:0: error: ‘Capability’ has no member named ‘spark_stats’
rts/Trace.h:516:0: error: ‘Capability’ has no member named ‘spark_stats’
rts/Trace.h:516:0: error: ‘Capability’ has no member named ‘spark_stats’
rts/Trace.h:516:0: error: ‘Capability’ has no member named ‘spark_stats’
rts/Trace.h:516:0: error: ‘Capability’ has no member named ‘spark_stats’
rts/Trace.h:516:0: warning: implicit declaration of function ‘sparkPoolSize’
rts/Trace.h:516:0: warning: nested extern declaration of ‘sparkPoolSize’
rts/Trace.h:516:0: error: ‘Capability’ has no member named ‘sparks’
make[1]: *** [rts/dist/build/Capability.o] Error 1
make[1]: *** Waiting for unfinished jobs
make: *** [all] Error 2

On Tue, Jul 19, 2011 at 9:55 PM, Johan Tibell johan.tib...@gmail.com wrote: Duncan, could you please take a look.

On Tue, Jul 19, 2011 at 9:51 PM, Johan Tibell johan.tib...@gmail.com wrote: I just unpulled all the new GHC event patches, starting with d77df1caad3a5f833aac9275938a0675e1ee6aac, and the build is chugging along.

On Tue, Jul 19, 2011 at 8:38 PM, Johan Tibell johan.tib...@gmail.com wrote: While trying to build head (from a maintainer-clean tree) I get the following error:

echo compiler_stage1_depfile_haskell_EXISTS = YES compiler/stage1/build/.depend-v.haskell.tmp for dir in compiler/stage1/build/./ compiler/stage1/build/Llvm/ compiler/stage1/build/LlvmCodeGen/ compiler/stage1/build/PPC/ compiler/stage1/build/RegAlloc/ compiler/stage1/build/RegAlloc/Graph/ compiler/stage1/build/RegAlloc/Linear/ compiler/stage1/build/RegAlloc/Linear/PPC/ compiler/stage1/build/RegAlloc/Linear/SPARC/ compiler/stage1/build/RegAlloc/Linear/X86/ compiler/stage1/build/SPARC/ compiler/stage1/build/SPARC/CodeGen/ compiler/stage1/build/Vectorise/ compiler/stage1/build/Vectorise/Builtins/ compiler/stage1/build/Vectorise/Monad/ compiler/stage1/build/Vectorise/Type/ compiler/stage1/build/Vectorise/Utils/ compiler/stage1/build/X86/; do if test ! -d $dir; then mkdir -p $dir; fi done grep -v ' : [a-zA-Z]:/' compiler/stage1/build/.depend-v.haskell.tmp compiler/stage1/build/.depend-v.haskell /usr/sbin/dtrace -Iincludes -Irts -Ilibffi/build/include -C -x cpppath=/usr/bin/gcc -h -o rts/dist/build/RtsProbes.h -s rts/RtsProbes.d . includes/HsFFI.h .. includes/ghcconfig.h ... includes/ghcautoconf.h ... includes/ghcplatform.h .. includes/stg/Types.h .. /usr/lib/gcc/i686-apple-darwin10/4.2.1/include/stdint.h .. /usr/lib/gcc/i686-apple-darwin10/4.2.1/include/float.h .
includes/rts/EventLogFormat.h

dtrace: failed to compile script rts/RtsProbes.d: line 71: parameter is already declared in probe input prototype: StgWord, parameter #6
make[1]: *** [rts/dist/build/RtsProbes.h] Error 1
make: *** [all] Error 2
Re: Profile: zero total time
Does it make a difference if you use the threaded vs. non-threaded runtime? I'm seeing the odd behavior on Mac, but only for the single-threaded runtime. http://hackage.haskell.org/trac/ghc/ticket/5282#comment:8

On Jul 7, 2011, at 2:45 PM, Matthew Farkas-Dyck wrote: Sorry, I ought to have mentioned: $ uname -sr Linux 2.6.38

On 7 July 2011 14:03, Daniel Fischer daniel.is.fisc...@googlemail.com wrote: On Thursday 07 July 2011, 20:44:57, Matthew Farkas-Dyck wrote: I am trying to take a profile of a program, but when I run it, the total time (as given in the profiling report file) is zero! If you're on a Mac, it could be http://hackage.haskell.org/trac/ghc/ticket/5282 -- Matthew Farkas-Dyck
Re: Installation
Perhaps the /Users/MyUser/.ghc folder is causing your problem?

On Apr 20, 2011, at 2:33 AM, Luca Ciciriello wrote: Hi All. I'm using GHC with MacOS X 10.6.7 (Xcode 4). I've installed GHC 7.0.3 and the HackageDB package hsgsom. Then, for my own reasons, I've uninstalled GHC. To remove GHC I've used the uninstaller tool and I've manually removed the folder /Library/Frameworks/GHC.framework. I've also manually removed the folder /Users/MyUser/Library/Haskell and the folder /Users/MyUser/.cabal. Now I've installed GHC again, but when I try to install the package hsgsom, cabal tells me that the package is already installed and that I have to use the --reinstall flag. So, where is the information about installed packages stored on my system? How can I remove all Haskell dependencies from my system in order to start with a clean installation? Thanks in advance for any answer. Luca.
Re: Linker error
The relevant GHC ticket is http://hackage.haskell.org/trac/ghc/ticket/5011, which it seems has already been fixed in HEAD. You can also check this thread on Haskell-Cafe, which contains a few workarounds for this problem: http://www.haskell.org/pipermail/haskell-cafe/2011-March/090051.html -Dave

On Mar 14, 2011, at 10:55 AM, Don Stewart wrote: There's an open bug ticket about XCode 4 not linking properly (I think due to the new dtrace support making GHC builds tied to a specific XCode version). Can you downgrade to XCode 3 in the meantime?

On Mon, Mar 14, 2011 at 8:43 AM, Luca Ciciriello luca_cicirie...@hotmail.com wrote: Hi All. I've just installed the new Haskell Platform (2011.2.0.0) on my MacOS X 10.6.6 with Xcode 4. Now the problem is that when I try to build my Haskell programs I receive the linker error: Linking lexer ... ld: library not found for -lcrt1.10.5.o collect2: ld returned 1 exit status Any idea? Thanks in advance. Luca
read-only-relocs on 64-bit OS X (was Re: ANNOUNCE: GHC 7.0.2 Release Candidate 1)
I'm getting a warning from the linker when building programs using the 64-bit version of the release candidate on Mac OS X 10.6.

$ cat Hello.hs
module Main where
main = putStrLn "Hello, World"

$ ~/ghc-7/bin/ghc -fforce-recomp Hello.hs
[1 of 1] Compiling Main ( Hello.hs, Hello.o )
Linking Hello ...
ld: warning: -read_only_relocs cannot be used with x86_64

It doesn't seem to cause a problem when actually running the programs, from what I have seen so far. -Dave

On Dec 16, 2010, at 12:36 PM, Ian Lynagh wrote: We are pleased to announce the first release candidate for GHC 7.0.2: http://www.haskell.org/ghc/dist/7.0.2-rc1/ This includes the source tarball, installers for OS X and Windows, and bindists for amd64/Linux, i386/Linux, amd64/FreeBSD and i386/FreeBSD. Please test as much as possible; bugs are much cheaper if we find them before the release! Thanks Ian, on behalf of the GHC team
Re: RFC: migrating to git
On Jan 10, 2011, at 5:19 AM, Simon Marlow wrote: We're interested in opinions from both active and potential GHC developers/contributors. Let us know what you think - would this make life harder or easier for you? Would it make you less likely or more likely to contribute?

+1 for moving to git. As an infrequent contributor I would welcome the move to git. I think the biggest advantage from my perspective would be enabling branches, which I have avoided up to now because of the painful process I hear about from others. Another possible advantage of git would be its support for submodules [1]. If we made the switch to git for all the repositories that GHC uses, then we could set them up as submodules. The advantage of submodules is that the GHC repo would contain pointers to the exact commit needed in each remote repository, and those pointers would be under version control. Having submodules for the other repos would be similar to the darcs-all script, but without the danger of leaving [dangling pointers][2] when making a new branch.

[1] http://www.kernel.org/pub/software/scm/git/docs/git-submodule.html
[2] http://www.haskell.org/pipermail/cvs-ghc/2010-November/057573.html
Re: [darcs-users] How to develop on a (GHC) branch with darcs
On Dec 8, 2010, at 2:45 AM, Simon Peyton-Jones wrote: If anyone has a favourite how-to-understand-git doc, do point me at it.

You may have already tried these, but I've found the [official git tutorial][1] to be pretty decent. The [second part][2] contains some details on how git sees the world. The [everyday git][3] document gives a pretty good idea of how the commands are used in standard workflows.

[1]: http://www.kernel.org/pub/software/scm/git/docs/gittutorial.html
[2]: http://www.kernel.org/pub/software/scm/git/docs/gittutorial-2.html
[3]: http://www.kernel.org/pub/software/scm/git/docs/everyday.html
Re: Loop optimisation with identical counters
I spent some time looking at the code generated for LLVM and the optimizations it can apply. There was quite a bit of detail to examine, and I wrote it up as a blog post here: http://www.dmpots.com/blog/2010/11/05/optimizing-haskell-loops-with-llvm.html

To summarize, I found that it is possible to get LLVM to do this transformation through a combination of tail-call elimination, inlining, induction variable optimization, and global value numbering. This works fine on x86_64, where we can pass parameters in registers, but fails to fully fire in the i386 back end because LLVM gets caught up on aliasing problems, since function parameters are passed on the stack. The possible aliasing between the stack pointer (Sp) and the function argument pointer (R1) prevented the full transformation, but LLVM was still able to reduce the f loop to straight-line code.

Exploring the details of the code generation for Haskell loops was a useful exercise. I found several sources of problems for optimizing the generated code.

1. The ability of LLVM to optimize Haskell functions is limited by the calling convention. Particularly for i386, function arguments are passed on a stack that LLVM knows nothing about. The reads and writes to the stack look like arbitrary loads and stores. It has no notion of popping elements from the stack, which makes it difficult to know when it is ok to eliminate stores to the stack.

2. The possible aliasing introduced by casting integer arguments (R1-R6) to pointers limits the effectiveness of its optimizations.

3. A single Haskell source function is broken up into many small functions in the back end. Every evaluation of a case statement requires a new continuation point. These small functions kill the optimization context for LLVM. LLVM can recover some of the context by inlining calls to known functions, but the effectiveness of inlining is limited since it does not know that we are passing some parameters on the stack and not through the actual function call.
4. The order of optimizations matters. We saw that just running `-O2` on the code may not be enough to get the full optimization effects. To get the full benefits of inlining in the x86_64 backend, we had to use the heavyweight sequence `-O2 -inline -std-compile-opts`.

I am interested in exploring several different opportunities.

* Make the cmm more friendly to LLVM by inlining and making loops in cmm. I think LLVM would benefit a lot from having a larger optimization context. We could relieve some of the burden on LLVM by doing some inlining and eliminating tail calls in the cmm itself. GHC knows that it is passing arguments on the stack, so it should be able to inline and turn tail calls into loops much better than LLVM can.

* Different calling conventions. All the functions in the code generated for LLVM use the same calling convention, fixed by GHC. It would be interesting to see if we could generate LLVM code where we pass all the arguments a function needs as actual arguments. We can then let LLVM do its optimizations and have a later pass that spills extra arguments to the stack and makes our functions use the correct GHC calling convention.

* Specialization of code after a runtime alias check. We could specialize the code into two cases, one where some pointers may alias and one where they do not. We can then let LLVM fully optimize the code with no aliases. We would insert a check at runtime to see if there are aliases and then call the correct bit of code.

* Optimization order matters. Probably there are some wins to be had by choosing a good optimization sequence for the code generated from GHC, rather than just using `-O1`, `-O2`, etc. I believe it should be possible to find a good optimization sequence that works well for Haskell code.

-David

On Nov 4, 2010, at 5:29 AM, Christian Hoener zu Siederdissen wrote: Here it is, feel free to change: http://hackage.haskell.org/trac/ghc/ticket/4470 I have added the core for the sub-optimal function 'f'.
Criterion benchmarks are there, too. It doesn't make much of a difference for this case -- I'd guess because everything fits into registers here, anyway. Gruss, Christian On 11/04/2010 09:42 AM, Simon Peyton-Jones wrote: Interesting. What would it look like in Core? Anyone care to make a ticket? S | -Original Message- | From: glasgow-haskell-users-boun...@haskell.org [mailto:glasgow-haskell-users- | boun...@haskell.org] On Behalf Of Roman Leshchinskiy | Sent: 03 November 2010 10:55 | To: Christian Hoener zu Siederdissen | Cc: glasgow-haskell-users@haskell.org | Subject: Re: Loop optimisation with identical counters | | LLVM doesn't eliminate the counters. FWIW, fixing this would improve performance of | stream fusion code quite a bit. It's very easy to do in Core. | | Roman | | On 3 Nov 2010, at 10:45, Christian Hoener zu
Re: Loop optimisation with identical counters
Hi Roman,

On Nov 5, 2010, at 6:44 PM, Roman Leshchinskiy wrote: On 05/11/2010, at 23:22, David Peixotto wrote: I spent some time looking at the code generated for LLVM and the optimizations it can apply. There was quite a bit of detail to examine, and I wrote it up as a blog post here: http://www.dmpots.com/blog/2010/11/05/optimizing-haskell-loops-with-llvm.html

Nice! Thanks a lot for doing that!

My pleasure :)

To summarize, I found that it is possible to get LLVM to do this transformation through a combination of tail-call elimination, inlining, induction variable optimization, and global value numbering. This works fine on x86_64, where we can pass parameters in registers, but fails to fully fire in the i386 back end because LLVM gets caught up on aliasing problems, since function parameters are passed on the stack. The possible aliasing between the stack pointer (Sp) and the function argument pointer (R1) prevented the full transformation, but LLVM was still able to reduce the f loop to straight-line code.

Hmm... IIRC we agreed that Sp is never aliased in GHC-generated code, and David Terei (I'm cc'ing here, not sure if he reads the list) made sure to include appropriate annotations in Haskell code. In fact, in your post Sp is passed as i32* noalias nocapture %Sp_Arg. Isn't that enough for LLVM to know that Sp isn't aliased?

Yes, the LLVM code has Sp, Hp, and Base all annotated as noalias. I believe that Sp, Hp, and Base should never alias, but a (boxed) R1 should always alias with either Sp or Hp. I had a hard time determining exactly how LLVM uses the noalias annotation, but playing with opt -print-alias-sets I saw that Sp was a MayAlias with the pointers derived from R1. I would guess that casting an int to a pointer (like we do for R1) makes that pointer MayAlias with everything regardless of the noalias annotation.

1. The ability of LLVM to optimize Haskell functions is limited by the calling convention.
Particularly for i386, function arguments are passed on a stack that LLVM knows nothing about. The reads and writes to the stack look like arbitrary loads and stores. It has no notion of popping elements from the stack, which makes it difficult to know when it is ok to eliminate stores to the stack.

But shouldn't it just promote stack locations to registers?

Yes, LLVM can and will promote the stack locations to registers, but since it doesn't know that Sp is really a stack, it is difficult for it to tell when it can avoid the writes back to the stack, even though *we* know they will not be visible once the function call returns.

2. The possible aliasing introduced by casting integer arguments (R1-R6) to pointers limits the effectiveness of its optimizations.

Yes, that's a big problem. David tried to solve some of it by including noalias annotations, but it's not clear what to do with, say, newly allocated ByteArrays, which can't be aliased by anything.

It may be profitable to write our own alias analysis pass for LLVM that encodes our knowledge of what can alias in the GHC world view. It wouldn't be useful for other LLVM clients, but it might be a good option for us.

Anyway, it's great to know that there are things we can improve to make LLVM optimise better.

Yeah, I'm generally very impressed with what LLVM is able to do with the code from GHC. Any help we can give it will just make it that much better!

Roman
Re: Loop optimisation with identical counters
On Nov 5, 2010, at 7:55 PM, Roman Leshchinskiy wrote: On 06/11/2010, at 00:28, David Peixotto wrote: Yes, the LLVM code has Sp, Hp, and Base all annotated as noalias. I believe that Sp, Hp, and Base should never alias, but a (boxed) R1 should always alias with either Sp or Hp. I had a hard time determining exactly how LLVM uses the noalias annotation, but playing with opt -print-alias-sets I saw that Sp was a MayAlias with the pointers derived from R1. I would guess that casting an int to a pointer (like we do for R1) makes that pointer MayAlias with everything regardless of the noalias annotation.

Are you sure about R1 aliasing Sp? AFAIK, R1 points to a closure on the heap, not to a stack location. That is, it can alias pointers on the stack or Hp, but it can't alias Sp itself. I don't think Sp can be aliased by anything outside of the garbage collector. Perhaps we shouldn't mark Hp as noalias, though.

Well, I'm not sure about R1 aliasing with Sp. I thought that there could be some cases where closures are allocated on the stack, but I could be wrong. I think the stack should still be reachable by the garbage collector, though. Can someone more familiar with GHC internals say whether R1 could point to the stack as well as the heap?

But shouldn't it just promote stack locations to registers?

Yes, LLVM can and will promote the stack locations to registers, but since it doesn't know that Sp is really a stack, it is difficult for it to tell when it can avoid the writes back to the stack, even though *we* know they will not be visible once the function call returns.

Right, I meant GHC stack locations. Let me rephrase my question: shouldn't it just promote array locations to registers?

Yes, it should promote array locations to (virtual) registers. I was mentioning the stack because I was thinking of something like this:

  x = Sp[0]
  x = x + 1
  Sp[0] = x
  Sp = Sp - 4
  return x

where x is a stack-allocated parameter.
LLVM has no way to know that the write back to the stack (Sp[0] = x) is redundant, because it sees Sp as an arbitrary pointer. We know that write is redundant because the stack space is deallocated before returning x.

It may be profitable to write our own alias analysis pass for LLVM that encodes our knowledge of what can alias in the GHC world view. It wouldn't be useful for other LLVM clients, but might be a good option for us.

Actually, I think our aliasing properties should be fairly close to those of, say, Java. I wonder how LLVM deals with those.

That's a good question. I don't think LLVM supports type-based alias analysis, which makes it much easier to disambiguate pointers in the Java world. Perhaps type information could help the GHC back end with alias analysis as well.

Yeah, I'm generally very impressed with what LLVM is able to do with the code from GHC. Any help we can give it will just make it that much better!

I have to say I'm slightly disappointed with what LLVM does with tight loops generated by GHC. That's not necessarily LLVM's fault; you are quite right that we should probably give it more information.

Yes, the more that Haskell loops look like the kind of loops that LLVM is accustomed to seeing, the better it should be at optimizing them.

-David
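As an editorial aside (not from the original thread), here is a minimal sketch of the kind of tight, accumulator-style loop under discussion. With a strict accumulator, the code GHC generates for loops like this is close to the counted loops LLVM's optimiser is designed around; the function name is illustrative only:

```haskell
{-# LANGUAGE BangPatterns #-}
module Main where

-- A tight first-order loop with a strict accumulator. Compiled with
-- -fllvm, GHC produces a simple counted loop that LLVM can optimise
-- much like a C for-loop.
sumTo :: Int -> Int
sumTo n = go 0 1
  where
    go !acc i
      | i > n     = acc
      | otherwise = go (acc + i) (i + 1)

main :: IO ()
main = print (sumTo 10)  -- prints 55
```

Without the bang pattern the accumulator builds up thunks, and the loop no longer looks like anything LLVM recognises.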
Re: Loop optimisation with identical counters
On Nov 5, 2010, at 8:56 PM, Brandon S Allbery KF8NH wrote:

On 11/5/10 19:22, David Peixotto wrote:

Probably there are some wins to be had by choosing a good optimization sequence for the code generated from GHC, rather than just using `-O1`, `-O2`, etc. I believe it should be possible to find a good optimization sequence that would work well for Haskell code.

Didn't someone (dons?) already make a start on this?

Searching for good compiler optimization sequences is certainly not a new idea. Most work that I know of focuses on finding good sequences for a particular program. I think an interesting opportunity here would be to search for good sequences that generally work well for Haskell programs, to replace the standard -O1, -O2 used by LLVM. I would think the code generated from GHC is different enough that we should be able to find standard sequences to replace the ones currently used by LLVM.

--
brandon s. allbery [linux,solaris,freebsd,perl] allb...@kf8nh.com
system administrator [openafs,heimdal,too many hats] allb...@ece.cmu.edu
electrical and computer engineering, carnegie mellon university KF8NH
Re: GHC.Types constructors with #
Hi Larry,

GHC allows you to work with unboxed types. Int# is the type of unboxed ints. I# is a normal data constructor. So we can see that GHC represents a (boxed) Int as a normal algebraic data type:

  data Int = I# Int#

which says that an Int is a type with a single constructor (I#) that wraps a machine integer (Int#). By convention, unboxed types use a # in their name. You can find more info about unboxed types here:

http://www.haskell.org/ghc/docs/6.12.2/html/users_guide/primitives.html#glasgow-unboxed

To work with unboxed types in your code (or ghci) you need the MagicHash extension:

http://www.haskell.org/ghc/docs/6.12.2/html/users_guide/syntax-extns.html#magic-hash

  $ ghci -XMagicHash
  GHCi, version 6.12.3: http://www.haskell.org/ghc/  :? for help
  Loading package ghc-prim ... linking ... done.
  Loading package integer-gmp ... linking ... done.
  Loading package base ... linking ... done.
  Loading package ffi-1.0 ... linking ... done.
  Prelude> import GHC.Types
  Prelude GHC.Types> I# 5#
  5
  Prelude GHC.Types>

-David

On Nov 1, 2010, at 12:40 PM, Larry Evans wrote:

http://www.haskell.org/ghc/docs/6.10.2/html/libraries/ghc-prim/GHC-Types.html contains:

  data Int = I# Int#

What does I# Int# mean? I've tried a simple interpretation:

  Prelude GHC.Types> I# 5#

  <interactive>:1:5: parse error (possibly incorrect indentation)

but obviously that failed :(

TIA.

-Larry
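To round this out (my addition, not part of the original message): the same idea in a compiled module, pattern-matching on I# to unwrap the machine integer and using the unboxed addition primop (+#) from GHC.Exts directly:

```haskell
{-# LANGUAGE MagicHash #-}
module Main where

import GHC.Exts (Int (I#), (+#))

-- Unwrap the two boxed Ints, add the raw Int# values with the
-- unboxed primop (+#), and re-box the result with the I# constructor.
addUnboxed :: Int -> Int -> Int
addUnboxed (I# x) (I# y) = I# (x +# y)

main :: IO ()
main = print (addUnboxed 2 3)  -- prints 5
```

This is exactly what the derived machinery for Int arithmetic does under the hood; normally you never need to write it by hand.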
Re: how to terminate an external program after timeout?
On Sep 9, 2010, at 6:37 AM, Simon Marlow wrote:

On 09/09/2010 10:39, Christian Maeder wrote:

Christian Maeder schrieb:

Hi, we call the metis prover from our Haskell application via

  System.Process.readProcessWithExitCode metis filename

However, we are not able to get rid of this process if metis does not terminate by itself. In particular, wrapping this call in System.Timeout.timeout does not work.

timeout works insofar as it is possible to start another action, but the continuing metis process still blocks the whole system.

C.

Any suggestions how we should handle this, ideally portably, but first of all under unix? (ghc-6.12.3)

Take a look at the timeout program in GHC's test suite: http://darcs.haskell.org/testsuite/timeout/timeout.hs

In case it's not obvious, I believe this has to be compiled with -threaded to get the desired behavior.

-David

Cheers,
Simon
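For reference, a minimal sketch (mine, not from the thread) of the approach the GHC timeout program takes: combine System.Timeout.timeout with terminateProcess from System.Process, so the child is actually killed when the deadline passes. This must be compiled with -threaded so that waitForProcess does not block the timeout thread; the command, arguments, and timeout value below are placeholders:

```haskell
-- Compile with: ghc -threaded Timeout.hs
module Main where

import System.Process (createProcess, proc, terminateProcess, waitForProcess)
import System.Timeout (timeout)

-- Run an external command, killing it if it runs longer than the
-- given number of microseconds. Returns Nothing on timeout.
runWithTimeout :: Int -> FilePath -> [String] -> IO (Maybe ())
runWithTimeout usecs cmd args = do
  (_, _, _, ph) <- createProcess (proc cmd args)
  r <- timeout usecs (waitForProcess ph >> return ())
  case r of
    Nothing -> do
      terminateProcess ph   -- deadline passed: kill the child process
      _ <- waitForProcess ph
      return Nothing
    Just () -> return (Just ())

main :: IO ()
main = do
  r <- runWithTimeout 1000000 "sleep" ["10"]
  print r
```

Note that terminateProcess sends SIGTERM on unix, so a child that ignores SIGTERM would need a stronger signal; the GHC test-suite program handles that case as well.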