Re: [Haskell-cafe] poor perfomance of indexU in uvector package

2009-11-16 Thread Alexey Khudyakov
On Sun, Nov 15, 2009 at 8:59 PM, Don Stewart d...@galois.com wrote:
 alexey.skladnoy:
 I found that perfomace of indexU is very poor and it is not fast O(1)
 operation which is very surprising. Here is some benchmarcking I've
 done. Everything compiled with -O2

 You're using the streamed version when its not fusing. Use the
 non-streaming direct implementation exported from Data.Array.Vector.UArr

 This is really an API bug, but I've not had time to sanitize the use.

Probably this should be stated more explicitly in documentation. This is
_very_ unexpected and confusing behaviour.

Also I don't quite understand nature of bug. Is this missing export or
wrong function is exported. And is streamed version of idexU useful
 and in which way?


On Sun, Nov 15, 2009 at 9:11 PM, Thomas DuBuisson
thomas.dubuis...@gmail.com wrote:
 The documentation explicitly says indexU is O(n) - no need for so much
 testing to rediscover that fact.  When I needed a contiguous block of
 values in UArr, I just relied on sliceU to acquire the block and
 performed a foldU.

Problems begin when you need non-contiguous block. Easiest way to so
is indexing.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] poor perfomance of indexU in uvector package

2009-11-16 Thread Roman Leshchinskiy
On 16/11/2009, at 22:46, Alexey Khudyakov wrote:

 Problems begin when you need non-contiguous block. Easiest way to so
 is indexing.

FWIW, this operation is called backpermute and is probably exported as bpermute 
in uvector.

Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] poor perfomance of indexU in uvector package

2009-11-15 Thread Alexey Khudyakov
Hello

This post meant to be literate haskell.

I found that perfomace of indexU is very poor and it is not fast O(1)
operation which is very surprising. Here is some benchmarcking I've
done. Everything compiled with -O2

Code below converts square 2D array to list of 1D arrays. Summation of
array contents is done in force evaluation

 import Control.Monad
 import Control.Monad.ST
 import Data.Array.Vector
 import System

 arr :: Int - UArr Double
 arr n = toU $ map fromIntegral [1 .. n*n]


This is fastest function. It slice arrays along another direction and used
mainly as upper bound of speed
 sliceY :: Int -  UArr Double - [UArr Double]
 sliceY n a = map (\i - sliceU a (i*n) n) [0 .. n-1]


Naive implementation using lists and index lookup.
  2.15 second for 200*200 array
 sliceXlist :: Int - UArr Double - [UArr Double]
 sliceXlist n a = map mkSlice [0 .. n-1]
 where
   mkSlice x = toU $ map (\y - indexU a (x + y*n)) [0 .. n-1]

Similar implementation in ST monad and it uses indexU too.
  2.14 seconds for 200*200 array
 sliceXst :: Int -   UArr Double - [UArr Double]
 sliceXst n a = map mkSlice [0 .. n-1]
 where
   mkSlice x = runST $ do arr - newMU n
  forM_ [0 .. n-1] $ \y - writeMU arr y (indexU a 
 (y*n + x))
  unsafeFreezeAllMU arr

This implementation avoids use of indexU by copying entire
2D array into mutable array and using it for lookup. Surprisingly
it outperform previsious implementations for sufficiently big n
  1.19 seconds for 200*200 array
 sliceXcopy :: Int -  UArr Double - [UArr Double]
 sliceXcopy n a = map mkSlice [0 .. n-1]
 where
   mkSlice x = runST $ do arr - newMU n
  cp  - newMU (n*n)
  copyMU cp 0 a
  forM_ [0 .. n-1] $ \y - writeMU arr y = 
 readMU cp (y*n + x)
  unsafeFreezeAllMU arr

This is another  implementation with lists which convert whole
array to list and picks appropriate element it. It is fastest implementation
so far.
0.039 seconds for 200*200 array
 sliceXlistfast :: Int - UArr Double - [UArr Double]
 sliceXlistfast n a = map mkSlice [0 .. n-1]
 where
   takeEvery n [] = []
   takeEvery n (x:xs) = x : takeEvery n (drop (n-1) xs)
   mkSlice x = toU $ takeEvery n . drop x $ fromU a




 
 main :: IO ()
 main = do
   [str,a] - getArgs
   let n = read str
   case a of
   y- print $ sum $ map sumU (sliceY n (arr n))
   list - print $ sum $ map sumU (sliceXlist n (arr n))
   lf   - print $ sum $ map sumU (sliceXlistfast n (arr n))
   st   - print $ sum $ map sumU (sliceXst   n (arr n))
   copy - print $ sum $ map sumU (sliceXcopy n (arr n))
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] poor perfomance of indexU in uvector package

2009-11-15 Thread Felipe Lessa
On Sun, Nov 15, 2009 at 03:00:34PM +0300, Alexey Khudyakov wrote:
 Naive implementation using lists and index lookup.
   2.15 second for 200*200 array
  sliceXlist :: Int - UArr Double - [UArr Double]
  sliceXlist n a = map mkSlice [0 .. n-1]
  where
mkSlice x = toU $ map (\y - indexU a (x + y*n)) [0 .. n-1]

Have you tried something like

  mkSlice x = mapU (\y - indexU a (x + y*n)) $ enumFromToU 0 (n-1)

I guess it should be a lot faster :).  Also, I would recomend
using criterion.  Another implementation you may try is

  a' = mapU (\(i :*: x) - (i `mod` n) :*: x) (indexedU a)
  mkSlice j = fstU $ filterU (\(i :*: x) - i == j) a'

HTH,

--
Felipe.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] poor perfomance of indexU in uvector package

2009-11-15 Thread Alexey Khudyakov
On Sun, Nov 15, 2009 at 5:50 PM, Felipe Lessa felipe.le...@gmail.com wrote:
 On Sun, Nov 15, 2009 at 03:00:34PM +0300, Alexey Khudyakov wrote:
 Naive implementation using lists and index lookup.
   2.15 second for 200*200 array
  sliceXlist :: Int - UArr Double - [UArr Double]
  sliceXlist n a = map mkSlice [0 .. n-1]
  where
mkSlice x = toU $ map (\y - indexU a (x + y*n)) [0 .. n-1]

 Have you tried something like

  mkSlice x = mapU (\y - indexU a (x + y*n)) $ enumFromToU 0 (n-1)

 I guess it should be a lot faster :).

No it doesn't. There is no significant difference between two variant above.
I think any program which uses indexU will be slowed to crawl.
Seems like a bug for me.


   Another implementation you may try is

  a' = mapU (\(i :*: x) - (i `mod` n) :*: x) (indexedU a)
  mkSlice j = fstU $ filterU (\(i :*: x) - i == j) a'

This one is fastest so far


 Also, I would recomend using criterion.

I tried to do so.. But it depends on gtk2hs and it is too difficult
to install
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] poor perfomance of indexU in uvector package

2009-11-15 Thread Yusaku Hashimoto
 Also, I would recomend using criterion.

 I tried to do so.. But it depends on gtk2hs and it is too difficult
 to install

You can install with the flag to skip gtk2hs installation. i.e. Try
`cabal install criterion -f-chart`

-~nwn
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] poor perfomance of indexU in uvector package

2009-11-15 Thread Don Stewart
alexey.skladnoy:
 Hello
 
 This post meant to be literate haskell.
 
 I found that perfomace of indexU is very poor and it is not fast O(1)
 operation which is very surprising. Here is some benchmarcking I've
 done. Everything compiled with -O2
 

You're using the streamed version when its not fusing. Use the
non-streaming direct implementation exported from Data.Array.Vector.UArr

This is really an API bug, but I've not had time to sanitize the use.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] poor perfomance of indexU in uvector package

2009-11-15 Thread Felipe Lessa
On Sun, Nov 15, 2009 at 06:16:03PM +0300, Alexey Khudyakov wrote:
Another implementation you may try is
 
   a' = mapU (\(i :*: x) - (i `mod` n) :*: x) (indexedU a)
   mkSlice j = fstU $ filterU (\(i :*: x) - i == j) a'
 

 This one is fastest so far

Nice!  Just for the record, of course I meant 'sndU' :).  Thanks
god Haskell is statically typed and that error should be caught
rather easily.

--
Felipe.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] poor perfomance of indexU in uvector package

2009-11-15 Thread Thomas DuBuisson
The documentation explicitly says indexU is O(n) - no need for so much
testing to rediscover that fact.  When I needed a contiguous block of
values in UArr, I just relied on sliceU to acquire the block and
performed a foldU.

Thomas

On Sun, Nov 15, 2009 at 4:00 AM, Alexey Khudyakov
alexey.sklad...@gmail.com wrote:
 Hello

 This post meant to be literate haskell.

 I found that perfomace of indexU is very poor and it is not fast O(1)
 operation which is very surprising. Here is some benchmarcking I've
 done. Everything compiled with -O2

 Code below converts square 2D array to list of 1D arrays. Summation of
 array contents is done in force evaluation

 import Control.Monad
 import Control.Monad.ST
 import Data.Array.Vector
 import System

 arr :: Int - UArr Double
 arr n = toU $ map fromIntegral [1 .. n*n]


 This is fastest function. It slice arrays along another direction and used
 mainly as upper bound of speed
 sliceY :: Int -  UArr Double - [UArr Double]
 sliceY n a = map (\i - sliceU a (i*n) n) [0 .. n-1]


 Naive implementation using lists and index lookup.
  2.15 second for 200*200 array
 sliceXlist :: Int - UArr Double - [UArr Double]
 sliceXlist n a = map mkSlice [0 .. n-1]
     where
       mkSlice x = toU $ map (\y - indexU a (x + y*n)) [0 .. n-1]

 Similar implementation in ST monad and it uses indexU too.
  2.14 seconds for 200*200 array
 sliceXst :: Int -   UArr Double - [UArr Double]
 sliceXst n a = map mkSlice [0 .. n-1]
     where
       mkSlice x = runST $ do arr - newMU n
                              forM_ [0 .. n-1] $ \y - writeMU arr y (indexU 
 a (y*n + x))
                              unsafeFreezeAllMU arr

 This implementation avoids use of indexU by copying entire
 2D array into mutable array and using it for lookup. Surprisingly
 it outperform previsious implementations for sufficiently big n
  1.19 seconds for 200*200 array
 sliceXcopy :: Int -  UArr Double - [UArr Double]
 sliceXcopy n a = map mkSlice [0 .. n-1]
     where
       mkSlice x = runST $ do arr - newMU n
                              cp  - newMU (n*n)
                              copyMU cp 0 a
                              forM_ [0 .. n-1] $ \y - writeMU arr y = 
 readMU cp (y*n + x)
                              unsafeFreezeAllMU arr

 This is another  implementation with lists which convert whole
 array to list and picks appropriate element it. It is fastest implementation
 so far.
 0.039 seconds for 200*200 array
 sliceXlistfast :: Int - UArr Double - [UArr Double]
 sliceXlistfast n a = map mkSlice [0 .. n-1]
     where
       takeEvery n []     = []
       takeEvery n (x:xs) = x : takeEvery n (drop (n-1) xs)
       mkSlice x = toU $ takeEvery n . drop x $ fromU a




 
 main :: IO ()
 main = do
   [str,a] - getArgs
   let n = read str
   case a of
       y    - print $ sum $ map sumU (sliceY     n (arr n))
       list - print $ sum $ map sumU (sliceXlist n (arr n))
       lf   - print $ sum $ map sumU (sliceXlistfast n (arr n))
       st   - print $ sum $ map sumU (sliceXst   n (arr n))
       copy - print $ sum $ map sumU (sliceXcopy n (arr n))
 ___
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe