Ah. I see now. Silly Haskell making inefficient algorithms hard to write and efficient ones easy. It's actually kind of annoying when learning, but probably for the best.
Is there a good write-up of the algorithm you're using somewhere? The Repa paper was very brief in its explaination, and I'm having trouble visualizing the mapping of the 2D matricies into 3 dimensions. - Clark On Mon, Dec 3, 2012 at 2:06 AM, Trevor L. McDonell < tmcdon...@cse.unsw.edu.au> wrote: > Hi Clark, > > The trick is that most accelerate operations work over multidimensional > arrays, so you can still get around the fact that we are limited to flat > data-parallelism only. > > Here is matrix multiplication in Accelerate, lifted from the first Repa > paper [1]. > > > import Data.Array.Accelerate as A > > type Matrix a = Array DIM2 a > > matMul :: (IsNum e, Elt e) => Acc (Matrix e) -> Acc (Matrix e) -> Acc > (Matrix e) > matMul arr brr > = A.fold (+) 0 > $ A.zipWith (*) arrRepl brrRepl > where > Z :. rowsA :. _ = unlift (shape arr) :: Z :. Exp Int :. Exp Int > Z :. _ :. colsB = unlift (shape brr) :: Z :. Exp Int :. Exp Int > > arrRepl = A.replicate (lift $ Z :. All :. colsB :. All) > arr > brrRepl = A.replicate (lift $ Z :. rowsA :. All :. All) > (A.transpose brr) > > > If you use github sources rather than the hackage package, those > intermediate replicates will get fused away. > > > Cheers, > -Trev > > [1] http://www.cse.unsw.edu.au/~chak/papers/KCLPL10.html > > > > > On 03/12/2012, at 5:07 PM, Clark Gaebel <cgae...@uwaterloo.ca> wrote: > > Hello cafe, > > I've recently started learning about cuda and hetrogenous programming, and > have been using accelerate [1] to help me out. Right now, I'm running into > trouble in that I can't call parallel code from sequential code. Turns out > GPUs aren't exactly like Repa =P. > > Here's what I have so far: > > import qualified Data.Array.Accelerate as A > import Data.Array.Accelerate ( (:.)(..) > , Acc > , Vector > , Scalar > , Elt > , fold > , slice > , constant > , Array > , Z(..), DIM1, DIM2 > , fromList > , All(..) > , generate > , lift, unlift > , shape > ) > import Data.Array.Accelerate.Interpreter ( run ) > > dotP :: (Num a, Elt a) => Acc (Vector a) -> Acc (Vector a) -> Acc (Scalar > a) > dotP xs ys = fold (+) 0 $ A.zipWith (*) xs ys > > type Matrix a = Array DIM2 a > > getRow :: Elt a => Int -> Acc (Matrix a) -> Acc (Vector a) > getRow n mat = slice mat . constant $ Z :. n :. All > > -- Naive matrix multiplication: > -- > -- index (i, j) is equal to the ith row of 'a' `dot` the jth row of 'b' > matMul :: A.Acc (Matrix Double) -> A.Acc (Matrix Double) -> A.Acc (Matrix > Double) > matMul a b' = A.generate (constant $ Z :. nrows :. ncols) $ > \ix -> > let (Z :. i :. j) = unlift ix > in getRow i a `dotP` getRow j b > where > b = A.transpose b' -- I assume row indexing is faster than column > indexing... > (Z :. nrows :. _ ) = unlift $ shape a > (Z :. _ :. ncols) = unlift $ shape b > > > This, of course, gives me errors right now because I'm calling getRow and > dotP from within the generation function, which expects Exp[ression]s, not > Acc[elerated computation]s. > > So maybe I need to replace that line with an inner for loop? Is there an > easy way to do that with Accelerate? > > Thanks for your help, > - Clark > > [1] http://hackage.haskell.org/package/accelerate > _______________________________________________ > Haskell-Cafe mailing list > Haskell-Cafe@haskell.org > http://www.haskell.org/mailman/listinfo/haskell-cafe > > > > _______________________________________________ > Haskell-Cafe mailing list > Haskell-Cafe@haskell.org > http://www.haskell.org/mailman/listinfo/haskell-cafe > >
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe