Hey,

On Tue, 2013-03-12 at 14:09 +0000, Geoffrey Mainland wrote:
> On 03/10/2013 09:52 PM, Nicolas Trangez wrote:
> > ...
>
> Hi Nicolas,
>
> Have you read our paper about the SIMD work? It's available here:
>
> https://research.microsoft.com/en-us/um/people/simonpj/papers/ndp/haskell-beats-C.pdf

I hadn't read that one before (I had read other stream-fusion related
papers), but I have now. I had already picked up most of it while reading
the vector simd branch commits. The benchmark results look very nice!

I'm afraid I didn't 'get' how the framework would allow for both AVX and
SSE instructions to work on streams, since it seems to assume Multi's are
always a fixed number of bytes wide (in this case 16 for SSE).

> The paper describes the issues involved with integrated SIMD
> instructions with the vector fusion framework.
>
> There are two primary issues with alignment: stack alignment and heap
> alignment.
>
> We cannot rely on the stack being properly aligned for AVX spills on any
> platform, and LLVM's stack fixup code does not play well with GHC, so we
> *rewrite* all AVX spill instructions to their unaligned counterparts. On
> Win32 we must do the same for SSE.

Does this imply stack values are always 16-byte aligned? I haven't
worked with AVX yet (my CPU doesn't support it).

> Unboxed vectors are allocated by GHC, and it does not align memory on
> 16-byte boundaries, so our first cut at SSE intrinsics simply used
> unaligned accesses. Obviously with ForeignPtr's we can control alignment
> and potentially use the aligned variants of SSE instructions, but this
> will almost double the number of primops. One could imagine extending
> our fusion framework to transition to aligned move instructions.

Right. I created the patch for #7067
(http://hackage.haskell.org/trac/ghc/ticket/7067) for vector-simd
purposes back then (adding mallocForeignPtrAlignedBytes and
mallocPlainForeignPtrAlignedBytes).

> Finally, LLVM 3.2 does not work with GHC. This means we cannot yet take
> advantage of its new vectorization optimizations, which is a shame.
>
> So, four projects for you or anyone else who is interested, in rough
> dependency order:
>
> 1) Get LLVM 3.2 working with GHC's LLVM back end.

According to other mails in this thread this should be fixed. I'll give
it a go.

> 2) Fix the stack alignment issue with LLVM. This will likely require a
> patch to LLVM.

I'm afraid that's a bit out of my league for now :-)

> 3) Add support for aligned move primops.

I looked into this before, might give it a stab.

> 4) Extend the current SIMD fusion framework to handle transitioning to
> aligned move instructions. As an alternative, only use aligned move
> instructions on memory that we know is aligned.

This is why I sent my previous mail initially: is there any plan how to
approach the 'memory that we know is aligned' bit? Would it make sense
to have a more general 'alignment restriction' framework for arbitrary
values, not only unboxed vectors (if there are any other use-cases)?

> These are all on my todo list, but my plate is quite full at the moment.

Heh, sounds familiar ;-)

Thanks,

Nicolas

_______________________________________________
ghc-devs mailing list
ghc-devs@haskell.org
http://www.haskell.org/mailman/listinfo/ghc-devs
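
To make the 'memory that we know is aligned' question from point 4 a bit
more concrete, here is a rough sketch of one possible approach: record the
alignment when the buffer is allocated, and check the address before
choosing between aligned and unaligned moves. It is only an illustration
and not how vector or GHC do it. AlignedBuf, mallocAligned and isAligned
are hypothetical names made up for this example; the only library call
assumed is mallocForeignPtrAlignedBytes from Foreign.ForeignPtr (the
function added by #7067), which takes the size first and the alignment
second.

```haskell
import Data.Word (Word8)
import Foreign.ForeignPtr (ForeignPtr, mallocForeignPtrAlignedBytes, withForeignPtr)
import Foreign.Ptr (Ptr, ptrToIntPtr)

-- | Hypothetical wrapper (not part of vector or GHC): a buffer that
-- remembers the alignment it was allocated with.
data AlignedBuf = AlignedBuf
  { bufAlignment :: !Int                 -- alignment in bytes (16 for SSE, 32 for AVX)
  , bufSize      :: !Int                 -- size in bytes
  , bufPtr       :: !(ForeignPtr Word8)  -- the underlying memory
  }

-- | Allocate 'size' bytes aligned to 'align' bytes.
mallocAligned :: Int -> Int -> IO AlignedBuf
mallocAligned align size = do
  fp <- mallocForeignPtrAlignedBytes size align  -- size first, then alignment
  return (AlignedBuf align size fp)

-- | True if the address is a multiple of 'align' bytes.
isAligned :: Int -> Ptr a -> Bool
isAligned align p = (fromIntegral (ptrToIntPtr p) :: Int) `rem` align == 0

main :: IO ()
main = do
  buf <- mallocAligned 16 64
  ok  <- withForeignPtr (bufPtr buf) (return . isAligned (bufAlignment buf))
  putStrLn (if ok
              then "16-byte aligned: aligned moves could be used"
              else "not aligned: fall back to unaligned moves")
```

A run-time check like isAligned is the simplest option. Carrying the
alignment in the type would be closer to a general 'alignment restriction'
framework, but that is beyond what this sketch tries to show.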