I'll take a look at getting the llvm-gcc route going by switching the gct 
variable to use pthread_getspecific() on Mac OS X. I can do some benchmarking 
to measure the impact.
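
For reference, here is a minimal sketch of the pthread_getspecific() approach.
It is not the actual RTS code: gc_thread is a cut-down stand-in, and gct_key,
init_gct_key, SET_GCT and count_copied are names made up for illustration. It
also shows Simon's suggestion of loading gct into a local in hot functions, so
that the key lookup happens once per call rather than once per access.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Cut-down stand-in for the RTS's per-thread GC state (hypothetical). */
typedef struct gc_thread_ {
    unsigned long copied;   /* words copied by this GC thread */
} gc_thread;

/* One process-wide key; each thread stores its own gc_thread under it. */
static pthread_key_t gct_key;

/* Call once at startup, before any thread reads or sets gct. */
static void init_gct_key(void)
{
    int rc = pthread_key_create(&gct_key, NULL);
    assert(rc == 0);
    (void)rc;
}

/* gct becomes a lookup instead of a pinned register or __thread variable. */
#define gct        ((gc_thread *)pthread_getspecific(gct_key))
#define SET_GCT(t) ((void)pthread_setspecific(gct_key, (t)))

/* Hot-path style: read gct once into a local and reuse it in the loop. */
static unsigned long count_copied(unsigned long nwords)
{
    gc_thread *my_gct = gct;
    unsigned long i;
    for (i = 0; i < nwords; i++) {
        my_gct->copied++;
    }
    return my_gct->copied;
}
```

Each GC thread would call SET_GCT() with its own gc_thread once on entry; the
benchmarking question is how much the remaining lookups cost compared to the
register version.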

I was playing around just to get the compilation to succeed. After a small 
change in StgCRun.c, the compile went through, but the stage 2 compiler then 
segfaulted because of the global register variables.

I thought that llvm-gcc would complain about the global register variables, but 
it seems to accept them and generate the assembly code to read and write them. 
The only problem is that it will also use these registers for other purposes, 
so gct was getting stomped, which was causing the segfaults.

So from what I can see, llvm-gcc dies at compile time when given __thread 
variables, and accepts global register variables but can generate code that 
stomps on the registers.
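
Given that, the choice of how to declare gct could look roughly like the sketch
below. This is an illustration, not the contents of rts/sm/GCTDecl.h:
USE_GCT_REGISTER is a made-up opt-in macro, r13 is only an example register,
GCT_METHOD and note_copied are hypothetical names, and I am assuming that
__llvm__ is predefined by llvm-gcc and clang. The register branch is untested
here.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Cut-down stand-in for the RTS's per-thread GC state (hypothetical). */
typedef struct gc_thread_ { unsigned long copied; } gc_thread;

#if defined(USE_GCT_REGISTER) && defined(__GNUC__) && !defined(__llvm__) \
    && defined(__x86_64__)
/* Traditional GCC backend only: pin gct to a fixed register. */
register gc_thread *gct __asm__("r13");
#define SET_GCT(t)  (gct = (t))
#define GCT_METHOD  "register"

#elif !defined(__APPLE__)
/* Toolchains that support the __thread modifier. */
static __thread gc_thread *gct;
#define SET_GCT(t)  (gct = (t))
#define GCT_METHOD  "__thread"

#else
/* OS X: fall back to a pthread key, created on the first SET_GCT(). */
static pthread_key_t gct_key;
static pthread_once_t gct_once = PTHREAD_ONCE_INIT;
static void gct_make_key(void) { pthread_key_create(&gct_key, NULL); }
#define gct         ((gc_thread *)pthread_getspecific(gct_key))
#define SET_GCT(t)  (pthread_once(&gct_once, gct_make_key), \
                     (void)pthread_setspecific(gct_key, (t)))
#define GCT_METHOD  "pthread_getspecific"
#endif

/* Code that uses gct reads the same whichever declaration was chosen. */
static void note_copied(unsigned long nwords)
{
    assert(gct != NULL);   /* SET_GCT() must already have run on this thread */
    gct->copied += nwords;
}
```

The point is that only the declaration changes per platform; callers keep
writing gct and SET_GCT() as before.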

-David

On Jun 24, 2011, at 3:23 AM, Simon Marlow wrote:

> On 21/06/2011 05:51, Manuel M T Chakravarty wrote:
>> austin seipp:
>>> (CC'ing Dan so he can chime in, for those who don't IRC.)
>>> 
>>> Dan Knapp (dankna on freenode) is running OS X Lion on his machine
>>> (and the corresponding new Xcode tools, I believe), and apparently Apple
>>> have gone the whole way in the next release and by default made
>>> 'gcc' a symbolic link to 'llvm-gcc'.
>> 
>> Just like my prediction ;)
>> 
>>> It's likely that will soon be
>>> clang, given llvm-gcc is already deprecated as of LLVM 2.9. There is
>>> still a regular GCC bundled with Lion apparently, ISTR Dan saying the
>>> executable was under /Developer under the name
>>> 'i686-apple-darwin-gcc-4.2' or somesuch, but I can't verify that (Snow
>>> Leopard here.) Anyone with lion want to chime in?
>> 
>> I would assume that 'gcc-4.2' will still point to the traditional GCC for a 
>> while.  Especially with C++, clang is still behind and there are still the 
>> odd code generator bugs in LLVM that require code generation with 
>> traditional gcc.
>> 
>>> Dan was working on build fixes/RTS fixes last week to try and make GHC
>>> build cleanly with the pthread_getspecific and work with compilers
>>> other than GCC. I think he did make some good headway in this area,
>>> but his work isn't done either.
>>> 
>>> Considering global register variables are a rather rare and intricate
>>> GCC extension, it's much more likely that we will see __thread support
>>> in Clang first (TLS also has implications for C++0x I've heard them
>>> say.) It's not on their short-term TODO list, however. In the mean
>>> time if apple were to remove GCC entirely for some reason, we'd still
>>> need Dan's patches, wouldn't we?
>> 
>> If we could move to clang (on OS X) that would be ideal, but as I wrote 
>> above I seriously doubt that Apple will entirely remove gcc (at least not 
>> before whatever cat comes after Lion).  So, for the time being, and until we 
>> can use clang, I think it would be wise to use 'gcc-4.2' as a default on OS 
>> X (instead of 'gcc', which appears to morph into llvm-gcc soon).  If we do 
>> that for GHC 7.2, then GHC 7.2 won't break once Apple flips the sym link 
>> over.
>> 
>> Simon, what do you think?
> 
> I have no strong opinions, you guys know the platform much better than me, so 
> I'm happy to go with whatever you think makes the most sense.
> 
> One thing I would keep an eye on is the performance of the GC, because the 
> handling of the gct thread-local variable is critical.  I can help you with 
> some quick benchmarks if you want to test out changes.
> 
> Cheers,
>       Simon
> 
> 
> 
>> Manuel
>> 
>> 
>>> On Sun, Jun 19, 2011 at 9:43 PM, Manuel M T Chakravarty
>>> <[email protected]>  wrote:
>>>> As llvm-gcc on OS X seems to require some work, I wonder whether we should 
>>>> by default build with the 'gcc-4.2' executable on OS X (which uses the 
>>>> traditional gcc backend), instead of the generic 'gcc' (probably still 
>>>> using 'gcc' as a fallback in configure if 'gcc-4.2' is not available).  
>>>> Then, when Apple makes the switch, binary GHC packages will continue to 
>>>> work.
>>>> 
>>>> Manuel
>>>> 
>>>> PS: I am all for resolving the problems with llvm-gcc, but that will 
>>>> likely take a while.  It'd be good to get a fix into 7.2, though.
>>>> 
>>>> Simon Marlow:
>>>>> On 01/06/2011 13:30, Manuel M T Chakravarty wrote:
>>>>>> Simon Marlow:
>>>>>>> On 01/06/2011 07:11, Manuel M T Chakravarty wrote:
>>>>>>>> Simon Marlow:
>>>>>>>>> On 30/05/2011 14:59, Manuel M T Chakravarty wrote:
>>>>>>>>>> It is no secret that Apple is moving away from the traditional GCC
>>>>>>>>>> backend to LLVM.  In fact, Xcode (which bundles all command line
>>>>>>>>>> developer tools on the Mac) today comes with two flavours of gcc:
>>>>>>>>>> 'gcc' and 'llvm-gcc', which AFAIK only differ in the backend that is
>>>>>>>>>> being used.  Currently, the default is the traditional GCC backend,
>>>>>>>>>> but it takes no precognition to realise that this will eventually
>>>>>>>>>> change.  The 'gcc' executable will use the LLVM backend and, at least
>>>>>>>>>> for a while, the traditional backend will still be available under a
>>>>>>>>>> different name.
>>>>>>>>>> 
>>>>>>>>>> Unfortunately, GHC will break at this point as the LLVM backend does
>>>>>>>>>> not support pinned global registers.  ('llvm-gcc' happily accepts the
>>>>>>>>>> register assignment, but fails with a runtime error during code
>>>>>>>>>> generation.)
>>>>>>>>> 
>>>>>>>>> This shouldn't be a problem.  We don't use pinned global registers 
>>>>>>>>> any more, except in one place - the GC (see rts/sm/GCTDecl.h).  There 
>>>>>>>>> it's optional, but you lose a bit of performance by not using a 
>>>>>>>>> pinned register.  It's not a huge deal.
>>>>>>>>> 
>>>>>>>>> Have you tried building GHC with llvm-gcc?  I think I tried it on the 
>>>>>>>>> RTS a year or so ago to check the LLVM output against gcc (LLVM 
>>>>>>>>> wasn't quite as good at the time).
>>>>>>>> 
>>>>>>>> Yes, I tried and it failed, while compiling the RTS, with
>>>>>>>> 
>>>>>>>>        sorry, unimplemented: LLVM cannot handle register variable ‘R1’, report a bug
>>>>>>>> 
>>>>>>>> This was using the 64bit version of GHC.  I'll have a closer look.
>>>>>>> 
>>>>>>> Perhaps that was when compiling StgCRun.c? It doesn't actually need 
>>>>>>> register variables (on x86_64 at least), but it does include the header 
>>>>>>> files, so that probably needs some #ifdefery somewhere for llvm-gcc.
>>>>>> 
>>>>>> Yes, it's in 'StgCRun.c'.   Ok, and how about on i386 (or do you want
>>>>>> to phase that arch out)?
>>>>> 
>>>>> It doesn't look like the x86 code in StgCRun.c uses registers either. The 
>>>>> sparc version does, but it could be rewritten.
>>>>> 
>>>>>>> The other place, as I mentioned above, is rts/sm/GCTDecl.h, which will 
>>>>>>> need to use a different method for declaring the garbage collector's 
>>>>>>> thread-local state variable, gct.  On x86_64 I found that using a fixed 
>>>>>>> register was the fastest, but using a thread-local variable (the 
>>>>>>> __thread modifier) also works.
>>>>>> 
>>>>>> Just to make sure I understand correctly, are you saying that using a
>>>>>> thread-local variable is already implemented as an option,
>>>>> 
>>>>> Yes - look at the series of #ifdefs in that file, it's pretty 
>>>>> straightforward to change how gct is declared for a particular platform.
>>>>> 
>>>>> However, I've just done some poking around and it seems that __thread is 
>>>>> not supported on OS X:
>>>>> 
>>>>> http://lifecs.likai.org/2010/05/mac-os-x-thread-local-storage.html
>>>>> 
>>>>> see also this thread about Clang:
>>>>> 
>>>>> http://lists.cs.uiuc.edu/pipermail/cfe-dev/2011-March/013673.html
>>>>> 
>>>>> It seems there might be support for __thread in the future, but not in 
>>>>> the short term.
>>>>> 
>>>>> It seems our very own David Peixotto tried building GHC with Clang a year 
>>>>> ago and ran into the same thing:
>>>>> 
>>>>> http://www.dmpots.com/blog/2010/05/08/building-ghc-with-clang.html
>>>>> 
>>>>> So this is less than ideal.  The short term fix would be to #define gct 
>>>>> to be a call to pthread_getspecific().  The call will be inlined - the 
>>>>> OS X headers define pthread_getspecific in terms of some inline assembly, 
>>>>> but the optimiser won't know anything about the inline assembly so it 
>>>>> won't be able to common up multiple loads of gct, and that probably means 
>>>>> it won't perform well.  If that's the case, then the solution is to load 
>>>>> up gct into a temporary in the performance-critical functions in the GC 
>>>>> (evacuate(), scavenge_block()), and add it as an argument to inline 
>>>>> functions.  I'd rather avoid having to do all that if possible.
>>>>> 
>>>>> If you want to benchmark the GC, there are some good programs in nofib/gc.
>>>>> 
>>>>> Cheers,
>>>>>       Simon
>>>> 
>>>> 
>>>> _______________________________________________
>>>> Cvs-ghc mailing list
>>>> [email protected]
>>>> http://www.haskell.org/mailman/listinfo/cvs-ghc
>>>> 
>>> 
>>> 
>>> 
>>> --
>>> Regards,
>>> Austin
>> 
> 
> 
> 

