Re: GHC, Clang & XCode 4.2

2011-10-12 Thread David Peixotto

On Oct 12, 2011, at 5:26 AM, Simon Marlow wrote:

 On 11/10/2011 18:45, David Peixotto wrote:
 Ok, I have attached a set of patches to support building the GHC
 runtime with llvm-gcc. The patches are based off of commit
 29a97fded4010bd01aa0a17945c84258e285d421 which was last Friday's HEAD.
 These patches are also available from my github repository on the
 llvm-gcc branch at
 
git://github.com/dmpots/ghc.git
 
 There are three patches:
 
  0001- Uses pthread_getspecific and pthread_setspecific to
 access gct when the `llvm_CC_FLAVOR` CPP macro is
 set
 
  0002- Modifies the configure scripts to set the
 `llvm_CC_FLAVOR` macro when compiling with an llvm-based
 compiler (either llvm-gcc or clang)
 
  0003- Passes the gct variable as a parameter in the GC. This
 change is parameterized with CPP macros so that it
 is only in effect when compiling for an llvm-based
 compiler.
 
 The patches 0001 and 0002 provide the minimal support needed to build
 GHC with llvm-gcc. The 0003 patch is there to limit the performance
 hit we get by going through pthread functions to access the gct. I
 think the 0001 and 0002 patches should not be very controversial, but
 the 0003 patch is a more invasive change and perhaps Simon Marlow will
 want to clean it up before it is applied.
 
 Thanks, I'll take a look at these soon.
 
 Just a thought, but someone might want to write a blog post about how Apple's 
 choice to move to llvm-gcc is imposing a performance penalty on us here, and 
 get it up on Reddit.  That would give the issue some publicity (they love 
 that sort of thing on Reddit), and might result in some action.  I'd be happy 
 to proof read a blog post before publication.  Some simple benchmarks would 
 be needed - one option is to use GHC itself with the various combinations of 
 compilers + RTS changes, or there are a small set of GC benchmarks in 
 nofib/gc.

Since I have the different versions already compiled I can do the benchmarking. 
I'll probably use GHC and fibon as the benchmarks since I know how to configure 
those the best.

It shouldn't be too much work to write a discussion on the problems we are 
having so I'd be willing to write the post too. It will take a few days before 
I can put it all together, but I'll send a draft out when I have something 
passable.

 Cheers,
   Simon
 
 
 
 I ran a validate with the patches, and found one additional failure
 when going through an llvm-based compiler. There were no additional
 failures when using a gcc compiler even with my patches applied. The
 additional failure is the cgrun071 test which tests the popCnt
 primitives. I'm going to look into why that test fails, but I think
 the patches should be safe to apply as it would only show up when
 compiling with llvm-gcc, which is currently impossible without these
 patches.
 
 -David
 
 
 
 
 
 

Re: GHC, Clang & XCode 4.2

2011-10-07 Thread David Peixotto

On Oct 6, 2011, at 7:32 AM, Simon Marlow wrote:

 On 05/10/2011 09:46, austin seipp wrote:
 There has been recent discussion on the Homebrew bug tracker
 concerning the upcoming XCode 4.2 release by Apple, which has
 apparently just gone GM (meaning they're going to make a real release
 on the app store Real Soon Now.)
 
 The primary concern is that XCode will no longer ship GCC 4.2 at all,
 it seems. XCode 4.0 and 4.1 merely set 'llvm-gcc' as the default
 compiler, and GHC's `configure` script was adjusted to find the
 `gcc-4.2` binary. If you have old XCode versions installed, then you may have
 the binaries lying around, but I doubt they'll be on your $PATH, and
 anybody doing a fresh install is SOL.
 
 It seems Clang 3.0 will now be the default compiler, with llvm-gcc as
 a deprecated option, probably removed in XCode 4.3. It doesn't matter
 really, because both of them do not work with GHC, because of its use
 of either A) Global register variables of any kind, or B) the __thread
 storage modifier.
 
 David Peixotto did some work on this not too long ago as the issue of
 compiling with Clang was raised. His branches include changes which
 make the 'gct' variable use pthread_getspecific rather than __thread
 for TLS and then as an optimization use inline ASM to grab the value
 out of the variable, with an impact of about 9% it seems, but that's
 on nofib and I don't know if it was -threaded. He also included a
 version which passes the 'gct' around as a parameter to all GC
 functions which is a bit uglier but may give some better performance I
 guess. (The discussion is from here IIRC.) I suppose the real perf
 killer here is probably -threaded code.
 
 https://github.com/dmpots/ghc
 
 Was there ever any decision on which route to take for this issue? The
 parameter passing solution looks considerably uglier IMO but it may be
 necessary for performance.
 
 I'm happy to incorporate the parameter-passing changes if necessary.  I think 
 it should only be important in the inner loop of the GC 
 (scavenge_block/evacuate and the functions called from there).  If someone 
 sends me a working patch, I can clean it up as much as possible.

I will take a look at getting the patches to work with GHC head. To support 
llvm-gcc we need two basic things:

1. autoconf support to detect when we are compiling with llvm-gcc
2. a work-around for tls in the gc

My old patches take care of both 1 and 2. For #2 the easy approach is using 
pthread_getspecific which is a pretty small change. Passing gct as a parameter 
is a much uglier, more invasive change, but I had it working at some point. It 
could perhaps be made less ugly by only passing it around in the important 
functions.
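
As a rough illustration of the two approaches for #2 (this is not the actual patch; `gc_thread`, `get_gct`, `HAVE_THREAD_KEYWORD` and the other names below are made up for the sketch), the pthread route and the parameter-passing route might look like this in C:

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the GC's per-thread state. */
typedef struct gc_thread {
    unsigned long copied_words;
} gc_thread;

#if defined(HAVE_THREAD_KEYWORD)   /* hypothetical macro: compiler supports __thread */
static __thread gc_thread *gct;
static void set_gct(gc_thread *t) { gct = t; }
static gc_thread *get_gct(void) { return gct; }
#else                              /* fall back to the pthread key API */
static pthread_key_t gct_key;
static pthread_once_t gct_key_once = PTHREAD_ONCE_INIT;

static void make_gct_key(void) { pthread_key_create(&gct_key, NULL); }

static void set_gct(gc_thread *t) {
    pthread_once(&gct_key_once, make_gct_key);
    pthread_setspecific(gct_key, t);
}

static gc_thread *get_gct(void) {
    pthread_once(&gct_key_once, make_gct_key);
    return pthread_getspecific(gct_key);
}
#endif

/* Approach 1: every call looks the context up through the accessor. */
static void copy_words_tls(unsigned long n) {
    get_gct()->copied_words += n;
}

/* Approach 2: the caller looks it up once and passes it down. */
static void copy_words_param(gc_thread *t, unsigned long n) {
    t->copied_words += n;
}

/* Drives both variants on the calling thread and returns the total. */
unsigned long run_gc_sketch(void) {
    gc_thread me = { 0 };
    set_gct(&me);

    for (int i = 0; i < 4; i++) copy_words_tls(1);

    gc_thread *t = get_gct();      /* one lookup, outside the loop */
    for (int i = 0; i < 4; i++) copy_words_param(t, 1);

    return me.copied_words;
}
```

The second variant is the invasive one, since every function on the hot path grows an extra argument, but it keeps the lookup out of the inner loop.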

Supporting clang is going to require more changes. The last time I tried there 
were problems with the preprocessor that made it not work. I just tried again 
yesterday and ran into a problem generating makefile dependencies because 
apparently clang does not allow both the -MM and -MF flags 
(http://llvm.org/bugs/show_bug.cgi?id=8312).


 Note that some of the overhead measured by David is coming from using the 
 LLVM backend to gcc instead of gcc's own backend.  In my own measurements a 
 few months ago I found LLVM generated slower (but smaller) code for the GC.  
 Maybe this will change over time.  It's also possible that the current GC 
 code is tuned for gcc - at various times in the past I've gone through and 
 tweaked the code to get good assembly out (much as we tweak Haskell to get 
 good Core :-).
 
 Cheers,
   Simon
 
 
 I'm just posting this here as a reminder as this is probably going to
 become a problem pretty quickly for anybody who uses Lion or modern
 XCode and also likes using GHC, so it should probably be sorted out.
 :) I'm still on SL using XCode 4 so it's not an issue for me, but it
 will be for any future Mac endeavors. Hopefully they get support for
 __thread or something equivalent soon, because nobody likes
 performance hits, but it doesn't seem like we have a choice.
 
 
 
 ___
 Glasgow-haskell-users mailing list
 Glasgow-haskell-users@haskell.org
 http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
 




Re: GHC 7.0.4 on Lion

2011-07-25 Thread David Peixotto
I think the warnings are not a big concern. I silence both of them by adding 
-optl-Wl,-no_compact_unwind,-no_pie to my ghc options in /usr/bin/ghc.

In 10.7 they changed the default linking options to create a PIE (position 
independent executable). To create a PIE you have to compile all code as 
position independent, which is the default option of GHC on Mac OS X. For 
performance reasons some code is compiled with absolute references (like the 
gmp library code in your example) so it cannot be used when creating a PIE. The 
advantage of a PIE executable is that it is more secure because the OS can load 
it at a random base address.

I believe the compact unwind warning is related to the creation of unwind 
frames for error handling with exceptions in languages like C++. There are some 
more details in this trac ticket: 
http://hackage.haskell.org/trac/ghc/ticket/5019. I'm not sure what the 
advantage of the compact unwind is, but it sounds like it could make the 
executable smaller.

-David

On Jul 25, 2011, at 7:59 AM, Luca Ciciriello wrote:

 Hi All.
 I've installed on my Mac the new MacOS X 10.7 (Lion) with Xcode 4.1
 
 Using ghc 7.0.4 (64-bit) on that system I get the following warnings in the 
 linking phase:
 
 Linking hslint ...
 ld: warning: could not create compact unwind for _ffi_call_unix64: does not 
 use RBP or RSP based frame
 ld: warning: PIE disabled. Absolute addressing (perhaps -mdynamic-no-pic) not 
 allowed in code signed PIE, but used in ___gmpn_modexact_1c_odd from 
 /Library/Frameworks/GHC.framework/Versions/7.0.4-x86_64/usr/lib/ghc-7.0.4/integer-gmp-0.2.0.3/libHSinteger-gmp-0.2.0.3.a(mode1o.o).
  To fix this warning, don't compile with -mdynamic-no-pic or link with 
 -Wl,-no_pie
 
 Is this something to worry about?
 
 Thanks in advance for any answer.
 
 Luca


Re: HEAD doesn't build on OS X

2011-07-20 Thread David Peixotto
I've attached a small patch that seems to fix the build. The problem was that 
the `dtraceSparkCounters` function was getting called even when we were not 
compiling for the threaded rts. The patch just moves the call inside the #ifdef 
for the threaded rts.

The call had been placed outside the #ifdef for the
THREADED_RTS symbol which caused a compile error since
the spark_stats field of a capability is only available when
compiling for the threaded rts.
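
For readers who have not seen the pattern, here is a minimal sketch of the kind of guard involved. The struct and function names are simplified placeholders, not the real RTS definitions; only the `THREADED_RTS` symbol comes from the discussion above.

```c
/* Hypothetical, heavily simplified capability: the spark counter field
 * only exists when THREADED_RTS is defined. */
typedef struct Capability {
    unsigned no;
#if defined(THREADED_RTS)
    unsigned long sparks_created;
#endif
} Capability;

#if defined(THREADED_RTS)
/* Touches a field that is absent from the non-threaded build. */
static unsigned long report_spark_counters(const Capability *cap) {
    return cap->sparks_created;
}
#endif

/* Returns the number of sparks reported, or 0 in the non-threaded build. */
unsigned long trace_capability(const Capability *cap) {
    unsigned long reported = 0;
#if defined(THREADED_RTS)
    /* The call must live inside the guard; outside it, the non-threaded
     * build fails with "has no member named ..." style errors. */
    reported = report_spark_counters(cap);
#else
    (void)cap;  /* nothing to report without the threaded fields */
#endif
    return reported;
}
```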



0001-Only-call-dtraceSparkCounters-in-threaded-rts.patch
Description: Binary data


On Jul 20, 2011, at 1:29 PM, Johan Tibell wrote:

 Still broken, but now for some other reason:
 
 inplace/bin/ghc-stage1 -optc-Wall -optc-Wextra
 -optc-Wstrict-prototypes -optc-Wmissing-prototypes
 -optc-Wmissing-declarations -optc-Winline -optc-Waggregate-return
 -optc-Wpointer-arith -optc-Wmissing-noreturn -optc-Wnested-externs
 -optc-Wredundant-decls -optc-Iincludes -optc-Irts -optc-DCOMPILING_RTS
 -optc-fno-strict-aliasing -optc-fno-common -optc-Ilibffi/build/include
 -optc-DDTRACE -optc-fomit-frame-pointer -optc-DRtsWay=\"rts_v\"
 -H64m -O -fasm -Iincludes -Irts -DCOMPILING_RTS -package-name rts
 -dcmm-lint  -Ilibffi/build/include -DDTRACE -i -irts
 -irts/dist/build -irts/dist/build/autogen -Irts/dist/build
 -Irts/dist/build/autogen   -optc-O2   -c rts/ClosureFlags.c -o
 rts/dist/build/ClosureFlags.o
 In file included from rts/Schedule.h:15,
 
 from rts/Capability.c:23:0:
 rts/Trace.h: In function ‘traceSparkCounters’:
 
 rts/Trace.h:516:0:
 error: ‘Capability’ has no member named ‘spark_stats’
 
 rts/Trace.h:516:0:
 error: ‘Capability’ has no member named ‘spark_stats’
 
 rts/Trace.h:516:0:
 error: ‘Capability’ has no member named ‘spark_stats’
 
 rts/Trace.h:516:0:
 error: ‘Capability’ has no member named ‘spark_stats’
 
 rts/Trace.h:516:0:
 error: ‘Capability’ has no member named ‘spark_stats’
 
 rts/Trace.h:516:0:
 error: ‘Capability’ has no member named ‘spark_stats’
 
 rts/Trace.h:516:0:
 warning: implicit declaration of function ‘sparkPoolSize’
 
 rts/Trace.h:516:0:
 warning: nested extern declaration of ‘sparkPoolSize’
 
 rts/Trace.h:516:0:
 error: ‘Capability’ has no member named ‘sparks’
 make[1]: *** [rts/dist/build/Capability.o] Error 1
 make[1]: *** Waiting for unfinished jobs
 make: *** [all] Error 2
 
 On Tue, Jul 19, 2011 at 9:55 PM, Johan Tibell johan.tib...@gmail.com wrote:
 Duncan, could you please take a look.
 
 On Tue, Jul 19, 2011 at 9:51 PM, Johan Tibell johan.tib...@gmail.com wrote:
 I just unpulled all the new GHC event patches, starting with
 d77df1caad3a5f833aac9275938a0675e1ee6aac, and the build is chugging
 along.
 
 On Tue, Jul 19, 2011 at 8:38 PM, Johan Tibell johan.tib...@gmail.com 
 wrote:
 While trying to build head (from a maintainer-clean tree) I get the
 following error:
 
 echo compiler_stage1_depfile_haskell_EXISTS = YES 
 compiler/stage1/build/.depend-v.haskell.tmp
 for dir in compiler/stage1/build/./ compiler/stage1/build/Llvm/
 compiler/stage1/build/LlvmCodeGen/ compiler/stage1/build/PPC/
 compiler/stage1/build/RegAlloc/ compiler/stage1/build/RegAlloc/Graph/
 compiler/stage1/build/RegAlloc/Linear/
 compiler/stage1/build/RegAlloc/Linear/PPC/
 compiler/stage1/build/RegAlloc/Linear/SPARC/
 compiler/stage1/build/RegAlloc/Linear/X86/
 compiler/stage1/build/SPARC/ compiler/stage1/build/SPARC/CodeGen/
 compiler/stage1/build/Vectorise/
 compiler/stage1/build/Vectorise/Builtins/
 compiler/stage1/build/Vectorise/Monad/
 compiler/stage1/build/Vectorise/Type/
 compiler/stage1/build/Vectorise/Utils/ compiler/stage1/build/X86/; do
 if test ! -d $dir; then mkdir -p $dir; fi done
 grep -v ' : [a-zA-Z]:/' compiler/stage1/build/.depend-v.haskell.tmp 
 compiler/stage1/build/.depend-v.haskell
 /usr/sbin/dtrace -Iincludes -Irts -Ilibffi/build/include -C -x
 cpppath=/usr/bin/gcc -h -o rts/dist/build/RtsProbes.h -s
 rts/RtsProbes.d
 . includes/HsFFI.h
 .. includes/ghcconfig.h
 ... includes/ghcautoconf.h
 ... includes/ghcplatform.h
 .. includes/stg/Types.h
 .. /usr/lib/gcc/i686-apple-darwin10/4.2.1/include/stdint.h
 .. /usr/lib/gcc/i686-apple-darwin10/4.2.1/include/float.h
 . includes/rts/EventLogFormat.h
 dtrace: failed to compile script rts/RtsProbes.d: line 71: parameter
 is already declared in probe input prototype: StgWord, parameter #6
 make[1]: *** [rts/dist/build/RtsProbes.h] Error 1
 make: *** [all] Error 2
 
 
 
 


Re: Profile: zero total time

2011-07-08 Thread David Peixotto
Does it make a difference if you use the threaded vs. non-threaded runtime? I'm 
seeing the odd behavior on Mac, but only for the single-threaded runtime.

http://hackage.haskell.org/trac/ghc/ticket/5282#comment:8

On Jul 7, 2011, at 2:45 PM, Matthew Farkas-Dyck wrote:

 Sorry, I ought to have mentioned:
 
 $ uname -sr
 Linux 2.6.38
 
 On 7 July 2011 14:03, Daniel Fischer daniel.is.fisc...@googlemail.com wrote:
 On Thursday 07 July 2011, 20:44:57, Matthew Farkas-Dyck wrote:
 I am trying to take a profile of a program, but when I run it, the
 total time (as given in the profiling report file) is zero!
 
 If you're on a Mac, it could be
 
 http://hackage.haskell.org/trac/ghc/ticket/5282
 
 
 
 
 -- 
 Matthew Farkas-Dyck
 


Re: Installation

2011-04-20 Thread David Peixotto
Perhaps the /Users/MyUser/.ghc folder is causing your problem?

On Apr 20, 2011, at 2:33 AM, Luca Ciciriello wrote:

 Hi All.
 I'm using GHC with MacOS X 10.6.7 (Xcode4). I've installed GHC 7.0.3 and the 
 HackageDB package hsgsom. Then, for my own reasons, I've uninstalled GHC.
 To remove GHC I've used the uninstaller tool and I've manually removed the 
 folder /Library/Frameworks/GHC.framework. I've also manually removed the 
 folder /Users/MyUser/Library/Haskell and the folder /Users/MyUser/.cabal
 
 Now I've installed GHC again, but when I try to install the package hsgsom 
 cabal tells me that the package is already installed and I have to use the 
 --reinstall flag. So where is the information about installed packages 
 stored on my system? How can I remove all Haskell dependencies from my system 
 in order to start with a clean installation?
 
 Thanks in advance for any answer.
 
 Luca.


Re: Linker error

2011-03-14 Thread David Peixotto
The relevant GHC ticket is: http://hackage.haskell.org/trac/ghc/ticket/5011, 
which it seems has already been fixed in HEAD.

You can also check this thread on Haskell-Cafe which contains a few workarounds 
for this problem: 
http://www.haskell.org/pipermail/haskell-cafe/2011-March/090051.html

-Dave

On Mar 14, 2011, at 10:55 AM, Don Stewart wrote:

 There's an open bug ticket about XCode 4 not linking properly (I think
 due to the new dtrace support making GHC builds tied to a specific
 XCode version).
 
 Can you downgrade to XCode 3 in the meantime?
 
 On Mon, Mar 14, 2011 at 8:43 AM, Luca Ciciriello
 luca_cicirie...@hotmail.com wrote:
 Hi All.
 I've just installed the new Haskell platform (2011.2.0.0) on my MacOS X 
 10.6.6 with Xcode 4
 
 Now the problem is that when I try to build my Haskell programs I receive the 
 linker error:
 
 Linking lexer ...
 ld: library not found for -lcrt1.10.5.o
 collect2: ld returned 1 exit status
 
 Any idea?
 
 Thanks in advance.
 
 Luca


read-only-relocs on 64-bit OS X (was Re: ANNOUNCE: GHC 7.0.2 Release Candidate 1)

2011-02-23 Thread David Peixotto
I'm getting a warning from the linker when building programs using the 64-bit 
version of the release candidate on Mac OS X 10.6.

$ cat Hello.hs
module Main where
main = putStrLn "Hello, World"

$ ~/ghc-7/bin/ghc -fforce-recomp Hello.hs 
[1 of 1] Compiling Main ( Hello.hs, Hello.o )
Linking Hello ...
ld: warning: -read_only_relocs cannot be used with x86_64

It doesn't seem to cause a problem when actually running the programs, from 
what I have seen so far.

-Dave

On Dec 16, 2010, at 12:36 PM, Ian Lynagh wrote:

 
 We are pleased to announce the first release candidate for GHC 7.0.2:
 
http://www.haskell.org/ghc/dist/7.0.2-rc1/
 
 This includes the source tarball, installers for OS X and Windows, and
 bindists for amd64/Linux, i386/Linux, amd64/FreeBSD and i386/FreeBSD.
 
 Please test as much as possible; bugs are much cheaper if we find them
 before the release!
 
 
 Thanks
 Ian, on behalf of the GHC team
 
 


Re: RFC: migrating to git

2011-01-10 Thread David Peixotto
On Jan 10, 2011, at 5:19 AM, Simon Marlow wrote:
 
 We're interested in opinions from both active and potential GHC 
 developers/contributors.  Let us know what you think - would this make life 
 harder or easier for you?  Would it make you less likely or more likely to 
 contribute?

+1 for moving to git

As an infrequent contributor I would welcome the move to git. I think the 
biggest advantage from my perspective would be enabling branches which I have 
avoided up to now because of the painful process I hear about from others.

Another possible advantage to git would be its support for submodules[1]. If we 
made the switch to git for all the repositories that GHC uses, then we could 
set them up as submodules. The advantage of submodules is that the GHC repo 
would contain pointers to the exact commit needed in the remote repository, and 
they would be under version control. Having submodules for the other repos 
would be similar to the darcs_all script, but would not have the danger of 
leaving [dangling pointers][2] when making a new branch.

[1] http://www.kernel.org/pub/software/scm/git/docs/git-submodule.html 
[2] http://www.haskell.org/pipermail/cvs-ghc/2010-November/057573.html


Re: [darcs-users] How to develop on a (GHC) branch with darcs

2010-12-08 Thread David Peixotto

On Dec 8, 2010, at 2:45 AM, Simon Peyton-Jones wrote:

 If anyone has a favourite "how to understand git" doc, do point me at it.

You may have already tried these, but I've found the [official git tutorial][1] 
to be pretty decent. The [second part][2] contains some details on how git sees 
the world. The [everyday git][3] document gives a pretty good idea of how the 
commands are used in standard workflows.

[1]: http://www.kernel.org/pub/software/scm/git/docs/gittutorial.html
[2]: http://www.kernel.org/pub/software/scm/git/docs/gittutorial-2.html
[3]: http://www.kernel.org/pub/software/scm/git/docs/everyday.html


Re: Loop optimisation with identical counters

2010-11-05 Thread David Peixotto
I spent some time looking at the code generated for llvm and the optimizations
it can apply. There were quite a few details to examine and I wrote it up
as a blog post here:
http://www.dmpots.com/blog/2010/11/05/optimizing-haskell-loops-with-llvm.html.

To summarize, I found that it is possible to get LLVM to do this
transformation through a combination of tail-call elimination, inlining,
induction variable optimization, and global value numbering. This works fine
on x86_64 where we can pass parameters in registers, but fails to fully fire
in the i386 back end, where LLVM gets caught up by aliasing problems because
function parameters are passed on the stack. The possible aliasing between the
stack pointer (Sp) and the function argument pointer (R1) prevented the full
transformation, but it was still able to reduce the f loop to straight line 
code.
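
For anyone joining the thread late, the transformation in question can be pictured in C roughly as follows. This is an illustrative sketch only, not the code from the ticket, and the function names are invented: two counters start equal and step together, so one of them is redundant.

```c
/* Before: i and j start equal and are incremented together, so they
 * always hold the same value inside the loop. */
long sum_two_counters(long n) {
    long acc = 0;
    for (long i = 0, j = 0; i < n; i++, j++) {
        acc += i * j;
    }
    return acc;
}

/* After: the redundant counter j has been replaced by i. */
long sum_one_counter(long n) {
    long acc = 0;
    for (long i = 0; i < n; i++) {
        acc += i * i;
    }
    return acc;
}
```

Both functions compute the same sum; the second simply keeps one fewer value live across iterations, which matters most when registers are scarce.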

Exploring the details of the code generation for Haskell loops was a useful
exercise. I found several sources of problems for optimizing the generated
code.

1. The ability of LLVM to optimize Haskell functions is limited by the calling
convention. Particularly for i386, function arguments are passed on a stack
that LLVM knows nothing about. The reads and writes to the stack look like
arbitrary loads and stores. It has no notion of popping elements from the
stack which makes it difficult to know when it is ok to eliminate stores to
the stack. 

2. The possible aliasing introduced by casting integer arguments
(R1-R6) to pointers limits the effectiveness of its optimizations.

3. A single Haskell source function is broken up into many small functions in
the back end. Every evaluation of a case statement requires a new continuation
point. These small functions kill the optimization context for LLVM. LLVM can
recover some of the context by inlining calls to known functions, but the
effectiveness of inlining is limited since it does not know that we are
passing some parameters on the stack and not through the actual function call.

4. The order of optimizations matters. We saw that just running `-O2` on the
code may not be enough to get the full optimization effects. To get the full
benefits of inlining in the x86_64 backend, we had to use the heavyweight
sequence `-O2 -inline -std-compiler-opts`.

I am interested in exploring several different opportunities.

* Make the cmm more friendly to LLVM by inlining and making loops in cmm

I think LLVM would benefit a lot from having a larger optimization
context. We could relieve some of the burden on LLVM by doing some
inlining and eliminating tail calls in the cmm itself. GHC knows that it
is passing arguments on the stack, so it should be able to inline and turn
tail calls into loops much better than LLVM can.

*  Different calling conventions

All the functions in the code generated for LLVM use the same calling
convention fixed by GHC. It would be interesting to see if we could
generate LLVM code where we pass all the arguments a function needs as
actual arguments. We can then let LLVM do its optimizations and then have
a later pass that spills extra arguments to the stack and makes our
functions use the correct GHC calling convention.

*  Specialization of code after a runtime alias check

We could specialize the code into two cases, one where some pointers may
alias and one where they do not. We can then let LLVM fully optimize the
code with no aliases. We would insert a check at runtime to see if there
are aliases and then call the correct bit of code.

*  Optimization order matters

Probably there are some wins to be had by choosing a good optimization
sequence for the code generated from GHC, rather than just using `-O1`,
`-O2`, etc. I believe it should be possible to find a good optimization
sequence that would work well for Haskell code.
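
To make the runtime alias check idea from the list above a little more concrete, here is a small C sketch. It is only an analogy under my own assumptions: the function names are invented, and `restrict` stands in for whatever no-alias information the LLVM-level code would actually carry.

```c
#include <stddef.h>
#include <stdint.h>

/* Fast path: restrict promises the compiler that dst and src do not
 * overlap, so it is free to reorder or vectorize the loop. */
static void add_noalias(long *restrict dst, const long *restrict src, size_t n) {
    for (size_t i = 0; i < n; i++) dst[i] += src[i];
}

/* Conservative path: the two ranges may overlap, so loads and stores
 * must stay in program order. */
static void add_may_alias(long *dst, const long *src, size_t n) {
    for (size_t i = 0; i < n; i++) dst[i] += src[i];
}

/* Dispatcher: one cheap address-range check at runtime picks the version. */
void add_arrays(long *dst, const long *src, size_t n) {
    uintptr_t d = (uintptr_t)dst;
    uintptr_t s = (uintptr_t)src;
    size_t bytes = n * sizeof(long);
    int disjoint = (d + bytes <= s) || (s + bytes <= d);

    if (disjoint) {
        add_noalias(dst, src, n);
    } else {
        add_may_alias(dst, src, n);
    }
}
```

The cost is code duplication plus one comparison per call, so this would presumably only pay off for loops hot enough to benefit from the unaliased version.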

-David

On Nov 4, 2010, at 5:29 AM, Christian Hoener zu Siederdissen wrote:

 Here it is, feel free to change:
 http://hackage.haskell.org/trac/ghc/ticket/4470
 
 I have added the core for the sub-optimal function 'f'. Criterion benchmarks 
 are there, too. It
 doesn't make much of a difference for this case -- I'd guess because 
 everything fits into registers
 here, anyway.
 
 Gruss,
 Christian
 
 On 11/04/2010 09:42 AM, Simon Peyton-Jones wrote:
 Interesting.  What would it look like in Core?  Anyone care to make a ticket?
 
 S
 
 |  -Original Message-
 |  From: glasgow-haskell-users-boun...@haskell.org 
 [mailto:glasgow-haskell-users-
 |  boun...@haskell.org] On Behalf Of Roman Leshchinskiy
 |  Sent: 03 November 2010 10:55
 |  To: Christian Hoener zu Siederdissen
 |  Cc: glasgow-haskell-users@haskell.org
 |  Subject: Re: Loop optimisation with identical counters
 |  
 |  LLVM doesn't eliminate the counters. FWIW, fixing this would improve 
 performance of
 |  stream fusion code quite a bit. It's very easy to do in Core.
 |  
 |  Roman
 |  
 |  On 3 Nov 2010, at 10:45, Christian Hoener zu 

Re: Loop optimisation with identical counters

2010-11-05 Thread David Peixotto
Hi Roman,

On Nov 5, 2010, at 6:44 PM, Roman Leshchinskiy wrote:

 On 05/11/2010, at 23:22, David Peixotto wrote:
 
 I spent some time looking at the code generated for llvm and the 
 optimizations
 it can apply. There were quite a few details to examine and I wrote it up
 as a blog post here:
 http://www.dmpots.com/blog/2010/11/05/optimizing-haskell-loops-with-llvm.html.
 
 Nice! Thanks a lot for doing that!
My pleasure :)

 
 To summarize, I found that it is possible to get LLVM to do this
 transformation through a combination of tail-call elimination, inlining,
 induction variable optimization, and global value numbering. This works fine
 on x86_64 where we can pass parameters in registers, but fails to fully fire
 in i386 back end because LLVM gets caught up by aliasing problems because
 function parameters are passed on the stack. The possible aliasing between 
 the
 stack pointer (Sp) and the function argument pointer (R1) prevented the full
 transformation, but it was still able to reduce the f loop to straight line 
 code.
 
 Hmm... IIRC we agreed that Sp is never aliased in GHC-generated code and 
 David Terei (I'm cc'ing here, not sure if he reads the list) made sure to 
 include appropriate annotations in Haskell code. In fact, in your post Sp is 
 passed as i32* noalias nocapture %Sp_Arg. Isn't that enough for LLVM to know 
 that Sp isn't aliased?

Yes, the LLVM code has Sp, Hp, Base all annotated as noalias. I believe that 
Sp, Hp, and Base should never alias, but a (boxed) R1 should always alias with 
either Sp or Hp. I had a hard time determining exactly how LLVM uses the 
noalias annotation, but playing with opt -print-alias-sets I saw that Sp was a 
MayAlias with the pointers derived from R1. I would guess that casting an int 
to a pointer (like we do for R1) makes that pointer MayAlias with everything 
regardless of the noalias annotation.

 1. The ability of LLVM to optimize Haskell functions is limited by the 
 calling
 convention. Particularly for i386, function arguments are passed on a stack
 that LLVM knows nothing about. The reads and writes to the stack look like
 arbitrary loads and stores. It has no notion of popping elements from the
 stack which makes it difficult to know when it is ok to eliminate stores to
 the stack. 
 
 But shouldn't it just promote stack locations to registers?

Yes, LLVM can and will promote the stack locations to registers, but since it 
doesn't know that Sp is really a stack, it is difficult for it to tell when it 
can avoid the writes back to the stack even though *we* know they will not be 
visible once the function call returns. 

 
 2. The possible aliasing introduced by casting integer arguments
 (R1-R6) to pointers limits the effectiveness of its optimizations.
 
 Yes, that's a big problem. David tried to solve some of it by including 
 noalias annotations but it's not clear what to do with, say, newly allocated 
 ByteArrays which can't be aliased by anything.

It may be profitable to write our own alias analysis pass for LLVM that encodes 
our knowledge of what can alias in the GHC world view. It wouldn't be useful 
for other LLVM clients, but might be a good option for us.

 
 Anyway, it's great to know that there are things we can improve to make LLVM 
 optimise better.

Yeah, I'm generally very impressed with what LLVM is able to do with the code 
from GHC. Any help we can give it will just make it that much better!

 Roman
 
 
 

___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Loop optimisation with identical counters

2010-11-05 Thread David Peixotto

On Nov 5, 2010, at 7:55 PM, Roman Leshchinskiy wrote:

 On 06/11/2010, at 00:28, David Peixotto wrote:
 
 Yes, the LLVM code has Sp, Hp, Base all annotated as noalias. I believe that 
 Sp, Hp, and Base should never alias, but a (boxed) R1 should always alias 
 with either Sp or Hp. I had a hard time determining exactly how LLVM uses 
 the noalias annotation, but playing with opt -print-alias-sets I saw that Sp 
 was a MayAlias with the pointers derived from R1. I would guess that casting 
 an int to a pointer (like we do for R1) makes that pointer MayAlias with 
 everything regardless of the noalias annotation.
 
 Are you sure about R1 aliasing Sp? AFAIK, R1 points to a closure on the heap, 
 not to a stack location. That is, it can alias pointers on the stack or Hp 
 but it can't alias the Sp itself. I don't think Sp can be aliased by anything 
 outside of the garbage collector.
 
 Perhaps we shouldn't mark Hp as noalias, though.

Well, I'm not sure about R1 aliasing with Sp. I thought that there could be 
some cases where closures are allocated on the stack, but I could be wrong. I 
think the stack should still be reachable by the garbage collector though. Can 
someone more familiar with GHC internals say whether R1 could point to the 
stack as well as the heap?

 
 But shouldn't it just promote stack locations to registers?
 
 Yes, LLVM can and will promote the stack locations to registers, but since 
 it doesn't know that Sp is really a stack, it is difficult for it to tell 
 when it can avoid the writes back to the stack even though *we* know they 
 will not be visible once the function call returns.
 
 Right, I meant GHC stack locations. Let me rephrase my question: shouldn't it 
 just promote array locations to registers?

Yes, it should promote array locations to (virtual) registers. I was mentioning 
the stack because I was thinking of something like this:

x = Sp[0]
x = x + 1
Sp[0] = x
Sp = Sp - 4
return x

where x is a stack allocated parameter. LLVM has no way to know that the write 
back to the stack (Sp[0] = x) is redundant because it sees Sp as an arbitrary 
pointer. We know that write is redundant because the stack space is 
deallocated before returning x.
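
To make the argument concrete, here is a minimal sketch in Haskell that models the snippet above. The Machine type, the helper names, and the convention that slots at addresses above Sp are deallocated are all assumptions made for this illustration (they follow the direction of "Sp = Sp - 4" in the snippet, not GHC's actual stack layout). The model only checks that dropping the write-back leaves everything the caller can observe unchanged.

```haskell
import qualified Data.Map as M

-- | Hypothetical machine state: Sp as an address plus word-addressed memory.
data Machine = Machine
  { sp  :: Int
  , mem :: M.Map Int Int
  }

-- | Read the word at Sp[0] (unset slots read as 0 in this model).
readTop :: Machine -> Int
readTop m = M.findWithDefault 0 (sp m) (mem m)

-- | The sequence from the message, including the write-back to the stack.
withStore :: Machine -> (Int, Machine)
withStore m =
  let x      = readTop m + 1                            -- x = Sp[0]; x = x + 1
      stored = m { mem = M.insert (sp m) x (mem m) }    -- Sp[0] = x
  in (x, stored { sp = sp stored - 4 })                 -- Sp = Sp - 4; return x

-- | The same sequence with the write-back removed.
withoutStore :: Machine -> (Int, Machine)
withoutStore m =
  let x = readTop m + 1
  in (x, m { sp = sp m - 4 })

-- | What the caller can see: the return value, the new Sp, and the slots
-- that are still allocated (assumed here to be addresses <= Sp).
observable :: (Int, Machine) -> (Int, Int, M.Map Int Int)
observable (x, m) =
  (x, sp m, M.filterWithKey (\addr _ -> addr <= sp m) (mem m))

-- | An arbitrary example state with three stack slots.
exampleMachine :: Machine
exampleMachine = Machine { sp = 8, mem = M.fromList [(0, 10), (4, 20), (8, 41)] }

main :: IO ()
main =
  print (observable (withStore exampleMachine)
           == observable (withoutStore exampleMachine))
```

The two versions differ only in the slot that is popped, which is exactly the knowledge LLVM lacks when it sees Sp as an arbitrary pointer.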

 
 It may be profitable to write our own alias analysis pass for LLVM that encodes 
 our knowledge of what can alias in the GHC world view. It wouldn't be useful 
 for other LLVM clients, but might be a good option for us.
 
 Actually, I think our aliasing properties should be fairly close to those of, 
 say, Java. I wonder how LLVM deals with those.

That's a good question. I don't think LLVM supports type-based alias analysis, 
which makes it much easier to disambiguate pointers in the Java world. Perhaps 
type information could help the GHC back end with alias analysis as well.

 
 Yeah, I'm generally very impressed with what LLVM is able to do with the 
 code from GHC. Any help we can give it will just make it that much better!
 
 I have to say I'm slightly disappointed with what LLVM does with tight loops 
 generated by GHC. That's not necessarily LLVM's fault, you are quite right 
 that we should probably give it more information.
 

Yes, the more that Haskell loops look like the kind of loops that LLVM is 
accustomed to seeing, the better it should be at optimizing them.

-David



Re: Loop optimisation with identical counters

2010-11-05 Thread David Peixotto

On Nov 5, 2010, at 8:56 PM, Brandon S Allbery KF8NH wrote:

 On 11/5/10 19:22 , David Peixotto wrote:
Probably there are some wins to be had by choosing a good optimization
sequence for the code generated from GHC, rather than just using `-O1`,
`-O2`, etc. I believe it should be possible to find a good optimization
sequence that would work well for Haskell codes.
 
 Didn't someone (dons?) already make a start on this?

Searching for good compiler sequences is certainly not a new idea. Most work 
that I know of focuses on finding good sequences for a particular program. I 
think an interesting opportunity here would be to search for sequences that 
work well for Haskell programs in general, to replace the standard -O1, -O2 
used by LLVM. I would think the code generated from GHC is different enough 
that we should be able to find standard sequences that we could use to replace 
the ones currently used by LLVM.

 
 - -- 
 brandon s. allbery [linux,solaris,freebsd,perl]  allb...@kf8nh.com
 system administrator  [openafs,heimdal,too many hats]  allb...@ece.cmu.edu
 electrical and computer engineering, carnegie mellon university  KF8NH
 



Re: GHC.Types consturctors with #

2010-11-01 Thread David Peixotto
Hi Larry,

GHC allows you to work with unboxed types. Int# is the type of unboxed ints. I# 
is a normal data constructor. So we can see that GHC represents a (boxed) Int 
as a normal algebraic data type

data Int = I# Int#

which says that an Int is a type with a single constructor (I#) that wraps a 
machine integer (Int#). By convention, unboxed types use a # in their name.

You can find more info about unboxed types here: 
http://www.haskell.org/ghc/docs/6.12.2/html/users_guide/primitives.html#glasgow-unboxed

To work with unboxed types in your code (or ghci) you need the MagicHash 
extension: 
http://www.haskell.org/ghc/docs/6.12.2/html/users_guide/syntax-extns.html#magic-hash

$ ghci -XMagicHash
GHCi, version 6.12.3: http://www.haskell.org/ghc/  :? for help
Loading package ghc-prim ... linking ... done.
Loading package integer-gmp ... linking ... done.
Loading package base ... linking ... done.
Loading package ffi-1.0 ... linking ... done.
Prelude> import GHC.Types
Prelude GHC.Types> I# 5#
5
Prelude GHC.Types> 
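
The same thing works in a source file. Here is a small sketch; the helper names (double#, plusInt, doubleInt) are made up for this example, and it assumes the I#, Int#, +# and *# names exported by GHC.Exts. It pattern matches on I# to get at the machine integer, does the arithmetic with primitive operations, and wraps the result back up.

```haskell
{-# LANGUAGE MagicHash #-}

import GHC.Exts (Int (I#), Int#, (+#), (*#))

-- | Hypothetical helper: double a raw machine integer.
double# :: Int# -> Int#
double# x = x *# 2#

-- | Unwrap two boxed Ints, add the raw values, and re-box the result.
plusInt :: Int -> Int -> Int
plusInt (I# x) (I# y) = I# (x +# y)

-- | Box the result of a computation done entirely on unboxed values.
doubleInt :: Int -> Int
doubleInt (I# x) = I# (double# x)

main :: IO ()
main = do
  print (I# 5#)         -- the same expression as in the GHCi session
  print (plusInt 2 3)
  print (doubleInt 21)
```

Without the MagicHash pragma the # suffix is not accepted in names or literals, which is the parse error you ran into.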

-David


On Nov 1, 2010, at 12:40 PM, Larry Evans wrote:

 http://www.haskell.org/ghc/docs/6.10.2/html/libraries/ghc-prim/GHC-Types.html
 
 contains:
 
 data Int = I# Int#
 
 What does I# Int# mean?  I've tried a simple interpretation:
 
  Prelude GHC.Types> I# 5#
 
  <interactive>:1:5: parse error (possibly incorrect indentation)
  Prelude GHC.Types>
 
 but obviously that failed :(
 
 TIA.
 
 -Larry
 
 



Re: how to terminate an external program after timeout?

2010-09-09 Thread David Peixotto
On Sep 9, 2010, at 6:37 AM, Simon Marlow wrote:

 On 09/09/2010 10:39, Christian Maeder wrote:
 Christian Maeder schrieb:
 Hi,
 
 we call from our haskell application the metis prover via
 
  System.Process.readProcessWithExitCode metis filename 
 
 However, we are not able to get rid of this process if metis does not
 terminate by itself. In particular, wrapping this call into a
 System.Timeout.timeout does not work.
 
 timeout works so far as it is possible to start another action, but the
 continuing metis process still blocks the whole system.
 
 C.
 
 Any suggestions how we should handle this ideally portably but first of
 all under unix. (ghc-6.12.3)
 
 Take a look at the timeout program in GHC's test suite:
 
 http://darcs.haskell.org/testsuite/timeout/timeout.hs

In case it's not obvious, I believe this has to be compiled with -threaded to 
get the desired behavior.
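
For the simple case, a polling approach may also be enough. The following is only a sketch and is not what the testsuite program does: runWithTimeout is a made-up name, the 50 ms polling step is arbitrary, and unlike readProcessWithExitCode it does not capture the program's output. It starts the process with createProcess, checks getProcessExitCode periodically, and calls terminateProcess once the time limit is used up.

```haskell
import Control.Concurrent (threadDelay)
import System.Exit (ExitCode)
import System.Process
  ( createProcess, getProcessExitCode, proc
  , terminateProcess, waitForProcess )

-- | Run a command with a time limit in microseconds. Returns Just the exit
-- code if the program finished in time, or Nothing if it had to be terminated.
runWithTimeout :: Int -> FilePath -> [String] -> IO (Maybe ExitCode)
runWithTimeout limitMicros cmd args = do
  (_, _, _, ph) <- createProcess (proc cmd args)
  poll ph limitMicros
  where
    step = 50000  -- polling interval in microseconds (arbitrary choice)

    poll ph remaining = do
      status <- getProcessExitCode ph
      case status of
        Just code -> return (Just code)
        Nothing
          | remaining <= 0 -> do
              terminateProcess ph      -- asks the process to exit (SIGTERM on Unix)
              _ <- waitForProcess ph   -- reap it so no zombie is left behind
              return Nothing
          | otherwise -> do
              threadDelay step
              poll ph (remaining - step)

main :: IO ()
main = do
  -- Assumes Unix `sleep` and `true` commands are on the PATH.
  slow <- runWithTimeout 200000 "sleep" ["5"]
  print slow
  quick <- runWithTimeout 2000000 "true" []
  print quick
```

Note that terminateProcess only signals the process that was started, so a program that spawns children of its own may need the process-group handling that the testsuite program uses.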

-David

 
 Cheers,
   Simon
 
