On Tue, Oct 11, 2011 at 6:22 PM, Lindsey Kuper <[email protected]> wrote: > Has there ever been any discussion of vectorization (i.e., taking > advantage of LLVM's vector types) in Rust? Patrick said that he'd > brought it up in passing before, but I don't think I've seen it come > up on the mailing list yet. I'm thinking about trying it out for a > class project. I'm at the "looked at the Wikipedia page some" level > of knowledge about vectorization right now, so I have a lot to learn. > Um...thoughts?
So I've been reading more about vectorization. The gold standard seems to be the Allen & Kennedy vectorization algorithm (chapter 2 of http://www.amazon.com/Optimizing-Compilers-Modern-Architectures-Dependence-based/dp/1558602860 -- sadly not free online, although a bootleg PDF can be found if you look hard enough). In order to get the data dependence graph that the vectorization algorithm uses to do its magic, you need to do a data dependence analysis; how to do this is the focus of Chapter 3 (and of the rest of the book, in some sense). So one step in adding vectorization to Rust would be to add a data dependence analysis pass to the compiler. However, it looks like a data dependence analysis is happening in LLVM already. At the very least, there are two passes in the LLVM documentation (http://llvm.org/docs/Passes.html) that have the word "dependence" in their names. The "loop dependence analysis" pass (http://llvm.org/docs/Passes.html#lda) is a likely suspect. Does anyone here know if that pass (or one that uses the result of it) introduces LLVM vector instructions into previously unvectorized IR? If we want vectorization to happen in Rust, should it happen in our LLVM codegen or should it happen during an LLVM pass (or both)? Here's some other interesting stuff I found while digging around: https://llvm.org/svn/llvm-project/poolalloc/tags/RELEASE_14/lib/DSA/Parallelize.cpp -- "This file implements a pass that automatically parallelizes a program, using the Cilk multi-threaded runtime system to execute parallel code. The pass uses the Program Dependence Graph (class PDGIterator) to identify parallelizable function calls, i.e., calls whose instances can be executed in parallel with instances of other function calls." This is the first time I've seen a use of Cilk in the wild, btw. https://llvm.org/svn/llvm-project/poolalloc/tags/RELEASE_14/lib/DSA/PgmDependenceGraph.cpp -- Here's the aforementioned Program Dependence Graph and PDGIterator. I don't know how similar or different any of this is to what Allen & Kennedy present, but it's all apparently part of something called the Automatic Pool Allocator, which presents itself as an LLVM optimization pass (https://llvm.org/svn/llvm-project/poolalloc/trunk/README) and which was the subject of a 2005 PLDI paper (http://llvm.org/pubs/2005-05-21-PLDI-PoolAlloc.pdf) that later became a chapter of Lattner's dissertation. (By the way, the paper contrasts "pools" with regions a little -- from my 30-second glance at it, the gist is something like "they're like regions, but for unsafe languages!".) I'm in far over my head now, so I guess I'm just going to go back to reading Allen & Kennedy and try to understand their algorithm for producing the dependence graph. Comments on any of this are welcome. Lindsey _______________________________________________ Rust-dev mailing list [email protected] https://mail.mozilla.org/listinfo/rust-dev
