Re: [Haskell-cafe] Simple GUI Form

2012-12-01 Thread timothyhobbs
I created that for a specific application I was writing, where I was saving 
the data directly to disk.  I modularized as an afterthought.  But I don't 
see anything wrong with using types that have JS in their names.  It is 
STILL a haskell data structure.  Just one which can be converted to a 
relatively universally parse-able String easily when needed ;)

It's fine if you don't use gtk.  I just wanted to show off my code :)

Timothy


-- Původní zpráva --
Od: Rune Harder Bak r...@bak.dk
Datum: 30. 11. 2012
Předmět: Re: [Haskell-cafe] Simple GUI Form

 I know it's not wx, but if you were willing to use GTK, you could simply
 install:

 http://hackage.haskell.org/package/gtk-jsinput
(http://hackage.haskell.org/package/gtk-jsinput)

 and generate the form automatically as described in:

 https://github.com/timthelion/gtk-jsinput/blob/master/Graphics/UI/Gtk/
Custom/JSInput.hs
(https://github.com/timthelion/gtk-jsinput/blob/master/Graphics/UI/Gtk/Custom/JSInput.hs)

Cool!

Would rather see it generated from the haskell data structure than from 
json,
but you could of cause generate json from the data type.

My understand is that GTK applications don't look and feel native on
windows/mac, that's why I went for wx,
but I could easily be wrong.___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] How to incrementally update list

2012-12-01 Thread Branimir Maksimovic

I want to simulate some calculation that does that.For example n-body 
simulation.Anyway this is solved ;)

 Date: Fri, 30 Nov 2012 13:25:57 -0800
 Subject: Re: [Haskell-cafe] How to incrementally update list
 From: kc1...@gmail.com
 To: bm...@hotmail.com
 CC: haskell-cafe@haskell.org
 
 Why do you want to incrementally update this list a lot of times?
 
 The question would affect the answer you get; i.e. some context
 (non-monadically speaking). :D
 
 
 On Wed, Nov 28, 2012 at 3:43 AM, Branimir Maksimovic bm...@hotmail.com 
 wrote:
  Problem is following short program:
  list = [1,2,3,4,5]
 
  advance l = map (\x - x+1) l
 
  run 0 s = s
  run n s = run (n-1) $ advance s
 
  main = do
  let s =  run 5000 list
  putStrLn $ show s
 
  I want to incrementally update list lot of times, but don't know
  how to do this.
  Since Haskell does not have loops I have to use recursion,
  but problem is that recursive calls keep previous/state parameter
  leading to excessive stack.and memory usage.
  I don't know how to tell Haskell not to keep previous
  state rather to release so memory consumption becomes
  managable.
 
  Is there some solution to this problem as I think it is rather
  common?
 
 
  ___
  Haskell-Cafe mailing list
  Haskell-Cafe@haskell.org
  http://www.haskell.org/mailman/listinfo/haskell-cafe
 
 
 
 
 -- 
 --
 Regards,
 KC
  ___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] computation over containers, greatly simplified notation.

2012-12-01 Thread Takayuki Muranushi
Thank you Jason, I have implemented those.
https://github.com/nushio3/practice/blob/master/free-objects/zipn-03.hs

I implemented what I have wanted. It is forZN in the following code.

https://github.com/nushio3/practice/blob/master/free-objects/zipf-05.hs

forZN, much like printf, can be used in place of any of the following
functions.

forLiftZ1 :: Zip f = f a - (a - b) - f bforLiftZ2 :: Zip f = f a
- f b - (a - b - c) - f c
forLiftZ3 :: Zip f = f a - f b - f c - (a - b - c - d) - f d

...and more...


The last example in above code is the usecase of forZN in my mind: to zip
several arrays with a long lambda. This pattern occurs frequently in my
codes.

One drawback of current approach is that we need verbose type annotations
like
 :: V.Vector String in   print $ (forZN vd1 vc1 vi1 f_dci_s :: V.Vector
String) .

Maybe this is because PType carries insufficient type-level information,
compared to Reduce and Insert. Now I'm now wondering if we can use
ghc-7.6.1's rich kind features such as type-level lists to remove those
verbose type annotations.




2012/12/1 Jason Dagit dag...@gmail.com

 You might find this paper an interesting read:
 http://www.brics.dk/RS/01/10/

 On Fri, Nov 30, 2012 at 4:43 PM, Takayuki Muranushi 
 muranu...@gmail.comwrote:

 Dear everyone,

 After a number of attempts [1] I'm starting to think that my initial
 approach was ill-directed.

 After all, Functor, Applicative, Zip are three different classes.
 Functors are type constructors where you can map unary functions over
 them.
 Applicatives are those with map-over of zero-ary functions (pure,) unary
 functions, binary functions, ternary functions, ... etc.
 Zip are those with unary, binary, ternary ... mapover, but not zero-ary
 map-over.

 Repa Arrays and Vectors belong to Zip because there's no trivial unique
 way to implement pure.

 What the customer really needed [2] seems to be the following series of
 functions:


 forLiftZ1 :: Zip f = f a - (a - b) - f b



 forLiftZ2 :: Zip f = f a - f b - (a - b - c) - f c




 forLiftZ3 :: Zip f = f a - f b - f c - (a - b - c - d) - f d



 Now I'm trying if it's possible to implement the series in a single shot
 [3] .

 I'm reporting my progress for anyone who might be still thinking for me.
 Thank you!!

 [1] https://github.com/nushio3/practice/tree/master/free-objects
 [2] http://www.projectcartoon.com/cartoon/3
 [3]
 https://github.com/nushio3/practice/blob/master/free-objects/printf6.hs




 2012/11/29 Takayuki Muranushi muranu...@gmail.com

 Dear all,

 I came up with an idea to greatly simplify some kinds of array
 computations. It should work well with many kinds of arrays. Is this new?

 https://gist.github.com/4162375



 These few days, I've been trying to rewrite a hydrodynamic simulation
 code that used Data.Vector (~250 lines), to Repa [1] . It seemed promising,
 but soon I realized that I needed to use Repa.map and Repa.zipWith almost
 everywhere. I need careful thinking to transform every lines (that used
 vector's indexing) to Repa's point-free stile. Is there any better ways?

 Then I realized that I was the author of Paraiso [2], a DSL for stencil
 computation. One of its feature is to write down array computation just as
 if it were scalar computation.

 Basically what I need is ZipList-like Applicative instances for vectors
 and Repa arrays. Why not they support ZipVector? Because 'pure' of zipList
 was an infinite list and you can't do infinite vectors. Then I came up with
 this idea.

 https://gist.github.com/4162375

 the wrapper W does several things: it represents the 'pure,' homogeneous
 array in a space-efficient manner, and also serves as a newtype-wrapper of
 Num (and possibly Fractional, Floating...) instances.

 Questions are: is this technology new? or promising? doomed?
 It seems to me like a free-Applicative, like the free-Monad theory. Are
 they related?
 The function 'backend' helps to mix in the non-zip-like computations.
 How can we remove the 'undefined' in the 'backend?'
 Some of Repa computations are Monads. W needs to be a monad transformer
 to incooperate this.

 Also I'm grateful to past cafe discussion on existing Zippable
 implementations [3][4] .

 [1] hackage.haskell.org/package/repa
 [2] http://hackage.haskell.org/package/Paraiso
 [3] http://www.haskell.org/pipermail/haskell-cafe/2009-July/064403.html
 [4]
 http://hackage.haskell.org/packages/archive/category-extras/latest/doc/html/Control-Functor-Zip.html


 --
 Takayuki MURANUSHI
 The Hakubi Center for Advanced Research, Kyoto University
 http://www.hakubi.kyoto-u.ac.jp/02_mem/h22/muranushi.html




 --
 Takayuki MURANUSHI
 The Hakubi Center for Advanced Research, Kyoto University
 http://www.hakubi.kyoto-u.ac.jp/02_mem/h22/muranushi.html

 ___
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe





-- 
Takayuki MURANUSHI
The Hakubi Center for Advanced Research, Kyoto University

Re: [Haskell-cafe] lambda case

2012-12-01 Thread Roman Cheplyaka
It hasn't made it to the standard yet, though. If some experimental
feature is implemented in GHC, it doesn't mean it's set in stone.

I find this discussion useful — there are some interesting points
(splitting case of into two parts) that I don't remember reading in the
original thread (but maybe it's just me).

Roman

* Brent Yorgey byor...@seas.upenn.edu [2012-11-30 09:52:53-0500]
 Oh, PLEASE people.  Let's not have another round of bikeshedding about
 this AFTER the feature is already implemented!
 
 -Brent
 
 On Fri, Nov 30, 2012 at 01:25:27PM +0100, Herbert Valerio Riedel wrote:
  Jon Fairbairn jon.fairba...@cl.cam.ac.uk writes:
  
  [...]
  
   “\case” complicates lambda, using “of” simply breaks “case … of …”
   into two easily understood parts.
  
  Just some observation (I'm rather late to the lambda-case discussion, so
  this might have been already pointed out previously):
  
  if the reserved keyword 'of' was to take the place of '\case', shouldn't
  then
  
'case' exp
  
  w/o the 'of' { alts }-part become a separately valid expression (with
  'case' essentially meaning 'flip ($)') to really break it up into two
  independent parts? Then 'case exp of { alts }' wouldn't be a special
  form anymore, but would just result from combining 'case' and 'of';
  
  'case' wouldn't even need to be a reserved keyword (and thus the grammar
  could be simplified), if it wasn't for the current grammar which
  requires to isolate a \case-expression by using () or $, consider e.g.:
  
{-# LANGUAGE LambdaCase #-}

import System.Environment

case' :: b - (b - c) - c
case' = flip ($)

main = do
  s - getArgs

  case' s $ \case  -- image '\case' was actually '\of' or 'of'
[x] - putStrLn (Hello  ++ x)
_   - putStrLn wrong number of arguments given
  
  
  just my 2¢
  
  cheers,
hvr
  
  ___
  Glasgow-haskell-users mailing list
  glasgow-haskell-us...@haskell.org
  http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
  
 
 ___
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Are storable-mutable-vectors in two dimensions possible in Haskell?

2012-12-01 Thread Fixie Fixie
Hello haskellers

I am struggling with this package: Data.Vector.Storable.Mutable

I am creating vectors like this: MV.new 1000 :: IO (V.MVector (PrimState IO) 
Int)

Now I would like access to this vectors in linear time, like I could have done 
in C using pointers.

The problem is that I do not know which datastructure to keep my vectors in. If 
I put them in a list, the access is very slow. Even a binary tree is very slow 
compared to a C-version.

I have tried to make an array og vector of storable-mutable vectors, but have 
not manged to accomplish this - even though there must be a haskell way :-)

Any clues?

Cheers,

Felix___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Are storable-mutable-vectors in two dimensions possible in Haskell?

2012-12-01 Thread Dmitry Dzhus
01.12.2012, 15:14, Fixie Fixie fixie.fi...@rocketmail.com:
 Hello haskellers
 I am struggling with this package: Data.Vector.Storable.Mutable
 I am creating vectors like this: MV.new 1000 :: IO (V.MVector (PrimState IO) 
 Int)
 Now I would like access to this vectors in linear time, like I could have 
 done in C using pointers.
 The problem is that I do not know which datastructure to keep my vectors in. 
 If I put them in a list, the access is very slow. Even a binary tree is very 
 slow compared to a C-version.
 I have tried to make an array og vector of storable-mutable vectors, but have 
 not manged to accomplish this - even though there must be a haskell way :-)

You can use boxed vector as a first-level structure holding pointers
you your storable vectors to perform O(1) indexing on first dimension.

Alternatively, if your matrix is dense, store your data in a
one-dimensional vector and use Repa as a thin wrapper to perform
multi-dimensional indexing (see toIndex/fromIndex
Data.Array.Repa.Shape).

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Vedr: Are storable-mutable-vectors in two dimensions possible in Haskell?

2012-12-01 Thread Fixie Fixie
Sounds good. I thought something like this could be the solution.

But it might be that my type-knowledge is not good enough - but I have not been 
able to do this. Example:

vec1 - MV.new 1000 :: IO (V.MVector (PrimState IO) Int)



Data.Vector.singleton vec1 -- does not work, gives a lot of type-errs.

Felix




 Fra: Dmitry Dzhus d...@dzhus.org
Til: Fixie Fixie fixie.fi...@rocketmail.com; Haskell cafe 
haskell-cafe@haskell.org 
Sendt: Lørdag, 1. desember 2012 12.27
Emne: Re: [Haskell-cafe] Are storable-mutable-vectors in two dimensions 
possible in Haskell?
 
01.12.2012, 15:14, Fixie Fixie fixie.fi...@rocketmail.com:
 Hello haskellers
 I am struggling with this package: Data.Vector.Storable.Mutable
 I am creating vectors like this: MV.new 1000 :: IO (V.MVector (PrimState IO) 
 Int)
 Now I would like access to this vectors in linear time, like I could have 
 done in C using pointers.
 The problem is that I do not know which datastructure to keep my vectors in. 
 If I put them in a list, the access is very slow. Even a binary tree is very 
 slow compared to a C-version.
 I have tried to make an array og vector of storable-mutable vectors, but have 
 not manged to accomplish this - even though there must be a haskell way :-)

You can use boxed vector as a first-level structure holding pointers
you your storable vectors to perform O(1) indexing on first dimension.

Alternatively, if your matrix is dense, store your data in a
one-dimensional vector and use Repa as a thin wrapper to perform
multi-dimensional indexing (see toIndex/fromIndex
Data.Array.Repa.Shape).___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Is anyone working on a sparse matrix library in Haskell?

2012-12-01 Thread mukesh tiwari
Hi Mark
To continue with library I just wrote a quick recursive matrix
multiplication. Since you mentioned about Strassen's algorithm so I went to
wikipedia ( http://en.wikipedia.org/wiki/Strassen_algorithm ) and wrote the
recursive algorithm using 4 multiplication but it's not very hard to modify
this to Strassen's algorithm. You can see code (
https://github.com/mukeshtiwari/Puzzles/blob/master/recursivematrixmultiplication/Matmultiplication.hs).
It's just quick code and it has lot of chance for improvement like
using
Data Parallel Haskell ) , some parallelism using Simon's monad-par library
or distributed parallelism using Cloud Haskell and sparse matrix
representation consideration.

Mukesh Tiwari



On Sat, Dec 1, 2012 at 3:28 AM, Mark Flamer m...@flamerassoc.com wrote:

 Thanks for all the replies,
  It sounds like there is enough interest and even some potential
 collaborators out there. I have created a few data structures to represent
 sparse vectors and matrices. The vector was a simple binary tree and the
 matrix a quad tree. As I suspected a standard IntMap was around 3X as fast
 as my binary tree, so I have switched to the IntMap for now. I was hoping
 to
 hold on to my quad tree for the matrix rep. because I like the elegance of
 the recursive algorithms like Strassen’s for multiplication. In the end it
 will most likely be best to have a few different matrix representations
 anyway. For instance, I know that compressed row form is most efficient for
 certain algorithms. I have a Gauss iterative solver working currently and
 am
 thinking of moving on to a parallel Conjugate gradient or direct solver
 using LU decomposition. I’m no expert in Haskell or numeric methods but I
 do
 my best. I’ll clean up what I have and open up the project on Github or
 Bitbucket. Any other thoughts or ideas are welcome.
 I’m currently using the Matrix Market files for testing. It would be nice
 to
 benchmark this against an industry standard solver in C or Fortran, without
 having to do a lot of work to set it up. Does anyone know of an easy way to
 do this? I’m just trying to get results to compare in orders of magnitude
 for now. It would be motivating to see these comparisons.




 --
 View this message in context:
 http://haskell.1045720.n5.nabble.com/Is-anyone-working-on-a-sparse-matrix-library-in-Haskell-tp5721452p5721556.html
 Sent from the Haskell - Haskell-Cafe mailing list archive at Nabble.com.

 ___
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] lambda case

2012-12-01 Thread Brandon Allbery
On Sat, Dec 1, 2012 at 5:30 AM, Roman Cheplyaka r...@ro-che.info wrote:

 I find this discussion useful — there are some interesting points
 (splitting case of into two parts) that I don't remember reading in the
 original thread (but maybe it's just me).


Mentioned twice that I recall, as treating 'of' as a lambda and as '\of'.
 It got somewhat short shrift, likely because while it makes sense from an
existing language syntax viewpoint, it makes little to none from a
readability standpoint.

-- 
brandon s allbery kf8nh   sine nomine associates
allber...@gmail.com  ballb...@sinenomine.net
unix, openafs, kerberos, infrastructure, xmonadhttp://sinenomine.net
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] computation over containers, greatly simplified notation.

2012-12-01 Thread Takayuki Muranushi
Dear all,

https://github.com/nushio3/practice/blob/master/free-objects/zipf-12.hs

I was finally able to remove the verbose type annotations.

The point was (1) to let the argument-list carry the type-constructor
information, so that only values of the type (v a) can enter the
heterogeneous list;

data Cons (v :: * - *) a b = Cons a b deriving (Eq, Show)
data Nil  (v :: * - *) = Nil  deriving (Eq, Show)

(2) to change the remainder of the program accordingly, and (3) to add
the following subtle modification, because instance finder does not
attempt to match unknown type a0 to v result, but it does so to
result.


class Reduce v f vxS result vyS | v f vxS - result vyS where
  reduce :: v f - vxS - (v result, vyS)
   ↓
class Reduce v f vxS result vyS | v f vxS - result vyS where
  reduce :: v f - vxS - (result, vyS)


Other samples work with modest type annotations.
https://github.com/nushio3/practice/blob/master/free-objects/zipf-13.hs


2012/12/1 Takayuki Muranushi muranu...@gmail.com

 Thank you Jason, I have implemented those. 
 https://github.com/nushio3/practice/blob/master/free-objects/zipn-03.hs

 I implemented what I have wanted. It is forZN in the following code.

 https://github.com/nushio3/practice/blob/master/free-objects/zipf-05.hs

 forZN, much like printf, can be used in place of any of the following 
 functions.

 forLiftZ1 :: Zip f = f a - (a - b) - f b
 forLiftZ2 :: Zip f = f a - f b - (a - b - c) - f c
 forLiftZ3 :: Zip f = f a - f b - f c - (a - b - c - d) - f
 d

 ...and more...


 The last example in above code is the usecase of forZN in my mind: to zip 
 several arrays with a long lambda. This pattern occurs frequently in my codes.

 One drawback of current approach is that we need verbose type annotations like
  :: V.Vector String in   print $ (forZN vd1 vc1 vi1 f_dci_s :: V.Vector 
 String) .

 Maybe this is because PType carries insufficient type-level information, 
 compared to Reduce and Insert. Now I'm now wondering if we can use 
 ghc-7.6.1's rich kind features such as type-level lists to remove those 
 verbose type annotations.





 2012/12/1 Jason Dagit dag...@gmail.com

 You might find this paper an interesting read: http://www.brics.dk/RS/01/10/

 On Fri, Nov 30, 2012 at 4:43 PM, Takayuki Muranushi muranu...@gmail.com 
 wrote:

 Dear everyone,

 After a number of attempts [1] I'm starting to think that my initial 
 approach was ill-directed.

 After all, Functor, Applicative, Zip are three different classes.
 Functors are type constructors where you can map unary functions over them.
 Applicatives are those with map-over of zero-ary functions (pure,) unary 
 functions, binary functions, ternary functions, ... etc.
 Zip are those with unary, binary, ternary ... mapover, but not zero-ary 
 map-over.

 Repa Arrays and Vectors belong to Zip because there's no trivial unique way 
 to implement pure.

 What the customer really needed [2] seems to be the following series of 
 functions:


 forLiftZ1 :: Zip f = f a - (a - b) - f b




 forLiftZ2 :: Zip f = f a - f b - (a - b - c) - f c





 forLiftZ3 :: Zip f = f a - f b - f c - (a - b - c - d) - f d





 Now I'm trying if it's possible to implement the series in a single shot 
 [3] .

 I'm reporting my progress for anyone who might be still thinking for me. 
 Thank you!!

 [1] https://github.com/nushio3/practice/tree/master/free-objects
 [2] http://www.projectcartoon.com/cartoon/3
 [3] https://github.com/nushio3/practice/blob/master/free-objects/printf6.hs




 2012/11/29 Takayuki Muranushi muranu...@gmail.com

 Dear all,

 I came up with an idea to greatly simplify some kinds of array 
 computations. It should work well with many kinds of arrays. Is this new?

 https://gist.github.com/4162375



 These few days, I've been trying to rewrite a hydrodynamic simulation code 
 that used Data.Vector (~250 lines), to Repa [1] . It seemed promising, but 
 soon I realized that I needed to use Repa.map and Repa.zipWith almost 
 everywhere. I need careful thinking to transform every lines (that used 
 vector's indexing) to Repa's point-free stile. Is there any better ways?

 Then I realized that I was the author of Paraiso [2], a DSL for stencil 
 computation. One of its feature is to write down array computation just as 
 if it were scalar computation.

 Basically what I need is ZipList-like Applicative instances for vectors 
 and Repa arrays. Why not they support ZipVector? Because 'pure' of zipList 
 was an infinite list and you can't do infinite vectors. Then I came up 
 with this idea.

 https://gist.github.com/4162375

 the wrapper W does several things: it represents the 'pure,' homogeneous 
 array in a space-efficient manner, and also serves as a newtype-wrapper of 
 Num (and possibly Fractional, Floating...) instances.

 Questions are: is this technology new? or promising? doomed?
 It seems to me like a free-Applicative, like the free-Monad theory. Are 
 they related?
 The function 'backend' helps to mix in the 

[Haskell-cafe] Why is boxed mutable array so slow?

2012-12-01 Thread Branimir Maksimovic

I have made benchmark test inspired by 
http://lemire.me/blog/archives/2012/07/23/is-cc-worth-it/
What surprised me is that unboxed array is much faster than boxed 
array.Actually boxed array performance is on par with standard Haskell 
listwhich is very slow.All in all even unboxed array is about 10 times slower 
than Java version.I don't understand why is even unboxed array so slow.But! 
unboxed array consumes least amount of RAM.(warning, program consumes more than 
3gb of ram)
 bmaxa@maxa:~/examples$ time ./Cumulboxed arraylast 262486571 seconds 
4.972unboxed arraylast 262486571 seconds 0.776listlast 262486571 seconds 6.812
real0m13.086suser0m11.996ssys 0m1.080s
-{-# 
LANGUAGE CPP, BangPatterns #-}import System.CPUTimeimport Text.Printfimport 
Data.Array.IOimport Data.Array.Baseimport Data.Intimport Control.DeepSeqimport 
System.Mem
main :: IO()main = do   (newArray_ (0,n'-1) :: IO(A)) = test boxed array
performGC   (newArray_ (0,n'-1) :: IO(B)) = test unboxed array  
performGC   begin - getCPUTime printf list\nlast %d $ last $ force $ 
take n' $ sum' data'end - getCPUTime   let diff = (fromIntegral (end - 
begin)) / (10^12)   printf  seconds %.3f\n (diff::Double)
test s a = do   putStrLn s  begin - getCPUTime init' a partial_sum a   
end - getCPUTime   let diff = (fromIntegral (end - begin)) / (10^12)   
last - readArray a (n'-1)  printf last %d seconds %.3f\n last 
(diff::Double)
n' :: Intn' = 50 * 1000 * 1000
type A = IOArray Int Int32type B = IOUArray Int Int32
init' a = do(_,n) - getBounds ainit a 0 n  where   init a 
k n  | k  n = return () | otherwise = 
dolet  !x = fromIntegral $ k + k `div` 3
  unsafeWrite a k x   init a (k+1) n
partial_sum a = do  (_,n) - getBounds a
k - unsafeRead a 0 ps a 1 n k  
where   ps a i n s  
| i  n = return () 
| otherwise = do
k - unsafeRead a i 
let !l = fromIntegral $ s + k   
unsafeWrite a i l   
ps a (i+1) n l
data' :: [Int32]data' = [k + k `div` 3 | k - [0..] ]
sum' = scanl1 (+)
  ___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Why is boxed mutable array so slow?

2012-12-01 Thread KC
Boxed arrays have a wrapper (extra layer of indirection) to allow for
a fully evaluated value, an unevaluated thunk, or the special value
bottom (a value that can contain bottom is referred to as lifted).

Unboxed arrays always have some value; that is, they cannot represent
a thunk nor bottom.


On Sat, Dec 1, 2012 at 8:09 AM, Branimir Maksimovic bm...@hotmail.com wrote:
 I have made benchmark test inspired by
 http://lemire.me/blog/archives/2012/07/23/is-cc-worth-it/

 What surprised me is that unboxed array is much faster than boxed array.
 Actually boxed array performance is on par with standard Haskell list
 which is very slow.
 All in all even unboxed array is about 10 times slower than Java version.
 I don't understand why is even unboxed array so slow.
 But! unboxed array consumes least amount of RAM.
 (warning, program consumes more than 3gb of ram)

  bmaxa@maxa:~/examples$ time ./Cumul
 boxed array
 last 262486571 seconds 4.972
 unboxed array
 last 262486571 seconds 0.776
 list
 last 262486571 seconds 6.812

 real0m13.086s
 user0m11.996s
 sys 0m1.080s

 -
 {-# LANGUAGE CPP, BangPatterns #-}
 import System.CPUTime
 import Text.Printf
 import Data.Array.IO
 import Data.Array.Base
 import Data.Int
 import Control.DeepSeq
 import System.Mem

 main :: IO()
 main = do
 (newArray_ (0,n'-1) :: IO(A)) = test boxed array
 performGC
 (newArray_ (0,n'-1) :: IO(B)) = test unboxed array
 performGC
 begin - getCPUTime
 printf list\nlast %d $ last $ force $ take n' $ sum' data'
 end - getCPUTime
 let diff = (fromIntegral (end - begin)) / (10^12)
 printf  seconds %.3f\n (diff::Double)

 test s a = do
 putStrLn s
 begin - getCPUTime
 init' a
 partial_sum a
 end - getCPUTime
 let diff = (fromIntegral (end - begin)) / (10^12)
 last - readArray a (n'-1)
 printf last %d seconds %.3f\n last (diff::Double)

 n' :: Int
 n' = 50 * 1000 * 1000

 type A = IOArray Int Int32
 type B = IOUArray Int Int32

 init' a = do
 (_,n) - getBounds a
 init a 0 n
 where
 init a k n
 | k  n = return ()
 | otherwise = do
 let  !x = fromIntegral $ k + k `div` 3
 unsafeWrite a k x
 init a (k+1) n

 partial_sum a = do
 (_,n) - getBounds a
 k - unsafeRead a 0
 ps a 1 n k
 where
 ps a i n s
 | i  n = return ()
 | otherwise = do
 k - unsafeRead a i
 let !l = fromIntegral $ s + k
 unsafeWrite a i l
 ps a (i+1) n l

 data' :: [Int32]
 data' = [k + k `div` 3 | k - [0..] ]

 sum' = scanl1 (+)


 ___
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe




-- 
--
Regards,
KC

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Why is boxed mutable array so slow?

2012-12-01 Thread Don Stewart
The obvious difference between boxed and unboxed arrays is that the
boxed arrays are full of pointers to heap allocated objects. This
means you pay indirection to access the values, much more time in GC
spent chasing pointers (though card marking helps), and generally do
more allocation.

Compare the GC stats below, for

* Boxed vector: 88M bytes copied; 75% of time in GC, 0.472s
* Unboxed vector: 11k bytes copied, 1.3% of time in GC, 0.077s

So there's your main answer. The increased data density of unboxed
arrays also helps a too.

Now, you can help out  the GC signifcantly by hinting at how much
you're going to allocated in the youngest generation (see the
ghc-gc-tune app for a methodical approach to this, though it needs
updating to ghc 7 --
http://donsbot.wordpress.com/2010/07/05/ghc-gc-tune-tuning-haskell-gc-settings-for-fun-and-profit/
 and 
http://stackoverflow.com/questions/3171922/ghcs-rts-options-for-garbage-collection
).

Use the +RTS -A flag to set an initial youngest generation heap size
to the size of your array, and watch the GC cost disappear. For our
boxed vector, we'd use +RTS -A50M, resulting in:

* Boxed vector: 8k copied, 1% of time in GC, 0.157s

So not bad. 3x speedup through a RTS flag. -A is very useful if you
are working with boxed, mutable arrays.

For reference, there's a generic version below that specializes based
on the vector type parameter.

-

{-# LANGUAGE BangPatterns #-}

import System.CPUTime
import Text.Printf
import Data.Int
import Control.DeepSeq
import System.Mem

import qualified Data.Vector.Mutable as V
import qualified Data.Vector.Unboxed.Mutable as U
import qualified Data.Vector.Generic.Mutable as G

main :: IO()
main = do

--   (G.new n' :: IO (V.IOVector Int32)) = test' boxed vector
--   performGC
   (G.new n' :: IO (U.IOVector Int32)) = test' unboxed vector
   performGC

test' s a = do
putStrLn s
begin - getCPUTime
init'' a
partial_sum' a
end - getCPUTime
let diff = (fromIntegral (end - begin)) / (10**12)
last - G.read a (n'-1)
printf last %d seconds %.3f\n last (diff::Double)

n' :: Int
n' = 1000 * 1000

init'' !a = init 0 (n'-1)
  where
init :: Int - Int - IO ()
init !k !n
| k  n = return ()
| otherwise = do
let !x = fromIntegral $ k + k `div` 3
G.write a k x
init (k+1) n



partial_sum' !a = do
k - G.read a 0
ps 1 (n'-1) k
  where
ps :: Int - Int - Int32 - IO ()
ps i n s
| i  n = return ()
| otherwise = do
k - G.read a i
let !l = fromIntegral $ s + k
G.write a i l
ps (i+1) n l


-

$ time ./A +RTS -s
boxed vector
last 945735787 seconds 0.420
  40,121,448 bytes allocated in the heap
  88,355,272 bytes copied during GC
  24,036,456 bytes maximum residency (6 sample(s))
 380,632 bytes maximum slop
  54 MB total memory in use (0 MB lost due to fragmentation)

  %GC time  75.2%  (75.9% elapsed)

  Alloc rate359,655,602 bytes per MUT second

./A +RTS -s  0.40s user 0.07s system 98% cpu 0.475 total


$ time ./A +RTS -s
unboxed vector
last 945735787 seconds 0.080
   4,113,568 bytes allocated in the heap
  11,288 bytes copied during GC
   4,003,256 bytes maximum residency (3 sample(s))
 182,856 bytes maximum slop
   5 MB total memory in use (0 MB lost due to fragmentation)

  %GC time   1.3%  (1.3% elapsed)

  Alloc rate51,416,660 bytes per MUT second

./A +RTS -s  0.08s user 0.01s system 98% cpu 0.088 total


$ time ./A +RTS -A50M -s
boxed vector
last 945735787 seconds 0.127
  40,121,504 bytes allocated in the heap
   8,032 bytes copied during GC
  44,704 bytes maximum residency (2 sample(s))
  20,832 bytes maximum slop
  59 MB total memory in use (0 MB lost due to fragmentation)

  %GC time   1.0%  (1.0% elapsed)

  Productivity  97.4% of total user, 99.6% of total elapsed

./A +RTS -A50M -s  0.10s user 0.05s system 97% cpu 0.157 total



-


On Sat, Dec 1, 2012 at 11:09 AM, Branimir Maksimovic bm...@hotmail.com wrote:
 I have made benchmark test inspired by
 http://lemire.me/blog/archives/2012/07/23/is-cc-worth-it/

 What surprised me is that unboxed array is much faster than boxed array.
 Actually boxed array performance is on par with standard Haskell list
 which is very slow.
 All in all even unboxed array is about 10 times slower than Java version.
 I don't understand why is even unboxed array so slow.
 But! unboxed array consumes least amount of RAM.
 (warning, program consumes more than 3gb of ram)

  bmaxa@maxa:~/examples$ time ./Cumul
 boxed array
 last 262486571 seconds 4.972
 unboxed array
 last 262486571 seconds 0.776
 list
 last 262486571 seconds 6.812

 real0m13.086s
 user0m11.996s
 sys 0m1.080s

 

Re: [Haskell-cafe] Why is boxed mutable array so slow?

2012-12-01 Thread Branimir Maksimovic

Wow that sped it up 5 times.I see that boxed Vector is 25% faster than 
IOArray.What is the difference and when to use Vector,when IOArray?Thanks!
bmaxa@maxa:~/examples$ time ./Cumul +RTS -A1600Mboxed arraylast 262486571 
seconds 1.196unboxed arraylast 262486571 seconds 0.748boxed vectorlast 
262486571 seconds 0.908unboxed vectorlast 262486571 seconds 0.720
real0m3.805suser0m3.428ssys 0m0.372s

 Date: Sat, 1 Dec 2012 12:20:37 -0500
 Subject: Re: [Haskell-cafe] Why is boxed mutable array so slow?
 From: don...@gmail.com
 To: bm...@hotmail.com
 CC: haskell-cafe@haskell.org
 
 The obvious difference between boxed and unboxed arrays is that the
 boxed arrays are full of pointers to heap allocated objects. This
 means you pay indirection to access the values, much more time in GC
 spent chasing pointers (though card marking helps), and generally do
 more allocation.
 
 Compare the GC stats below, for
 
 * Boxed vector: 88M bytes copied; 75% of time in GC, 0.472s
 * Unboxed vector: 11k bytes copied, 1.3% of time in GC, 0.077s
 
 So there's your main answer. The increased data density of unboxed
 arrays also helps a too.
 
 Now, you can help out  the GC signifcantly by hinting at how much
 you're going to allocated in the youngest generation (see the
 ghc-gc-tune app for a methodical approach to this, though it needs
 updating to ghc 7 --
 http://donsbot.wordpress.com/2010/07/05/ghc-gc-tune-tuning-haskell-gc-settings-for-fun-and-profit/
  and 
 http://stackoverflow.com/questions/3171922/ghcs-rts-options-for-garbage-collection
 ).
 
 Use the +RTS -A flag to set an initial youngest generation heap size
 to the size of your array, and watch the GC cost disappear. For our
 boxed vector, we'd use +RTS -A50M, resulting in:
 
 * Boxed vector: 8k copied, 1% of time in GC, 0.157s
 
 So not bad. 3x speedup through a RTS flag. -A is very useful if you
 are working with boxed, mutable arrays.
 
 For reference, there's a generic version below that specializes based
 on the vector type parameter.
 
 -
 
 {-# LANGUAGE BangPatterns #-}
 
 import System.CPUTime
 import Text.Printf
 import Data.Int
 import Control.DeepSeq
 import System.Mem
 
 import qualified Data.Vector.Mutable as V
 import qualified Data.Vector.Unboxed.Mutable as U
 import qualified Data.Vector.Generic.Mutable as G
 
 main :: IO()
 main = do
 
 --   (G.new n' :: IO (V.IOVector Int32)) = test' boxed vector
 --   performGC
(G.new n' :: IO (U.IOVector Int32)) = test' unboxed vector
performGC
 
 test' s a = do
 putStrLn s
 begin - getCPUTime
 init'' a
 partial_sum' a
 end - getCPUTime
 let diff = (fromIntegral (end - begin)) / (10**12)
 last - G.read a (n'-1)
 printf last %d seconds %.3f\n last (diff::Double)
 
 n' :: Int
 n' = 1000 * 1000
 
 init'' !a = init 0 (n'-1)
   where
 init :: Int - Int - IO ()
 init !k !n
 | k  n = return ()
 | otherwise = do
 let !x = fromIntegral $ k + k `div` 3
 G.write a k x
 init (k+1) n
 
 
 
 partial_sum' !a = do
 k - G.read a 0
 ps 1 (n'-1) k
   where
 ps :: Int - Int - Int32 - IO ()
 ps i n s
 | i  n = return ()
 | otherwise = do
 k - G.read a i
 let !l = fromIntegral $ s + k
 G.write a i l
 ps (i+1) n l
 
 
 -
 
 $ time ./A +RTS -s
 boxed vector
 last 945735787 seconds 0.420
   40,121,448 bytes allocated in the heap
   88,355,272 bytes copied during GC
   24,036,456 bytes maximum residency (6 sample(s))
  380,632 bytes maximum slop
   54 MB total memory in use (0 MB lost due to fragmentation)
 
   %GC time  75.2%  (75.9% elapsed)
 
   Alloc rate359,655,602 bytes per MUT second
 
 ./A +RTS -s  0.40s user 0.07s system 98% cpu 0.475 total
 
 
 $ time ./A +RTS -s
 unboxed vector
 last 945735787 seconds 0.080
4,113,568 bytes allocated in the heap
   11,288 bytes copied during GC
4,003,256 bytes maximum residency (3 sample(s))
  182,856 bytes maximum slop
5 MB total memory in use (0 MB lost due to fragmentation)
 
   %GC time   1.3%  (1.3% elapsed)
 
   Alloc rate51,416,660 bytes per MUT second
 
 ./A +RTS -s  0.08s user 0.01s system 98% cpu 0.088 total
 
 
 $ time ./A +RTS -A50M -s
 boxed vector
 last 945735787 seconds 0.127
   40,121,504 bytes allocated in the heap
8,032 bytes copied during GC
   44,704 bytes maximum residency (2 sample(s))
   20,832 bytes maximum slop
   59 MB total memory in use (0 MB lost due to fragmentation)
 
   %GC time   1.0%  (1.0% elapsed)
 
   Productivity  97.4% of total user, 99.6% of total elapsed
 
 ./A +RTS -A50M -s  0.10s user 0.05s system 97% cpu 0.157 total
 
 
 
 -
 
 
 On Sat, Dec 1, 2012 at 11:09 AM, Branimir Maksimovic bm...@hotmail.com 
 wrote:
  

Re: [Haskell-cafe] Why is boxed mutable array so slow?

2012-12-01 Thread Don Stewart
Regarding when to use mutable arrays versus vectors, I would always
use vectors -- they optimize better, and have a better interface.

Also, I have updated and released a new version of the tool mentioned below.
You can get it on Hackage, updated to ghc 7 series.

http://hackage.haskell.org/package/ghc-gc-tune-0.3

For your boxed vector program, we get results that show a clear
performance peak with a -A of around 64M, about  the size of the
allocated array ...

http://i.imgur.com/dZ2Eo.png

Best settings for Running time:
0.16s:  +RTS -A67108864 -H1048576
0.16s:  +RTS -A67108864 -H2097152
0.16s:  +RTS -A67108864 -H8388608

E.g.

$ time ./A +RTS -A67M -H1M
boxed vector
last 945735787 seconds 0.123

-- Don

On Sat, Dec 1, 2012 at 12:20 PM, Don Stewart don...@gmail.com wrote:
 The obvious difference between boxed and unboxed arrays is that the
 boxed arrays are full of pointers to heap allocated objects. This
 means you pay indirection to access the values, much more time in GC
 spent chasing pointers (though card marking helps), and generally do
 more allocation.

 Compare the GC stats below, for

 * Boxed vector: 88M bytes copied; 75% of time in GC, 0.472s
 * Unboxed vector: 11k bytes copied, 1.3% of time in GC, 0.077s

 So there's your main answer. The increased data density of unboxed
 arrays also helps a too.

 Now, you can help out  the GC signifcantly by hinting at how much
 you're going to allocated in the youngest generation (see the
 ghc-gc-tune app for a methodical approach to this, though it needs
 updating to ghc 7 --
 http://donsbot.wordpress.com/2010/07/05/ghc-gc-tune-tuning-haskell-gc-settings-for-fun-and-profit/
  and 
 http://stackoverflow.com/questions/3171922/ghcs-rts-options-for-garbage-collection
 ).

 Use the +RTS -A flag to set an initial youngest generation heap size
 to the size of your array, and watch the GC cost disappear. For our
 boxed vector, we'd use +RTS -A50M, resulting in:

 * Boxed vector: 8k copied, 1% of time in GC, 0.157s

 So not bad. 3x speedup through a RTS flag. -A is very useful if you
 are working with boxed, mutable arrays.

 For reference, there's a generic version below that specializes based
 on the vector type parameter.

 -

 {-# LANGUAGE BangPatterns #-}

 import System.CPUTime
 import Text.Printf
 import Data.Int
 import Control.DeepSeq
 import System.Mem

 import qualified Data.Vector.Mutable as V
 import qualified Data.Vector.Unboxed.Mutable as U
 import qualified Data.Vector.Generic.Mutable as G

 main :: IO()
 main = do

 --   (G.new n' :: IO (V.IOVector Int32)) = test' boxed vector
 --   performGC
(G.new n' :: IO (U.IOVector Int32)) = test' unboxed vector
performGC

 test' s a = do
 putStrLn s
 begin - getCPUTime
 init'' a
 partial_sum' a
 end - getCPUTime
 let diff = (fromIntegral (end - begin)) / (10**12)
 last - G.read a (n'-1)
 printf last %d seconds %.3f\n last (diff::Double)

 n' :: Int
 n' = 1000 * 1000

 init'' !a = init 0 (n'-1)
   where
 init :: Int - Int - IO ()
 init !k !n
 | k  n = return ()
 | otherwise = do
 let !x = fromIntegral $ k + k `div` 3
 G.write a k x
 init (k+1) n



 partial_sum' !a = do
 k - G.read a 0
 ps 1 (n'-1) k
   where
 ps :: Int - Int - Int32 - IO ()
 ps i n s
 | i  n = return ()
 | otherwise = do
 k - G.read a i
 let !l = fromIntegral $ s + k
 G.write a i l
 ps (i+1) n l


 -

 $ time ./A +RTS -s
 boxed vector
 last 945735787 seconds 0.420
   40,121,448 bytes allocated in the heap
   88,355,272 bytes copied during GC
   24,036,456 bytes maximum residency (6 sample(s))
  380,632 bytes maximum slop
   54 MB total memory in use (0 MB lost due to fragmentation)

   %GC time  75.2%  (75.9% elapsed)

   Alloc rate359,655,602 bytes per MUT second

 ./A +RTS -s  0.40s user 0.07s system 98% cpu 0.475 total


 $ time ./A +RTS -s
 unboxed vector
 last 945735787 seconds 0.080
4,113,568 bytes allocated in the heap
   11,288 bytes copied during GC
4,003,256 bytes maximum residency (3 sample(s))
  182,856 bytes maximum slop
5 MB total memory in use (0 MB lost due to fragmentation)

   %GC time   1.3%  (1.3% elapsed)

   Alloc rate51,416,660 bytes per MUT second

 ./A +RTS -s  0.08s user 0.01s system 98% cpu 0.088 total


 $ time ./A +RTS -A50M -s
 boxed vector
 last 945735787 seconds 0.127
   40,121,504 bytes allocated in the heap
8,032 bytes copied during GC
   44,704 bytes maximum residency (2 sample(s))
   20,832 bytes maximum slop
   59 MB total memory in use (0 MB lost due to fragmentation)

   %GC time   1.0%  (1.0% elapsed)

   Productivity  97.4% of total user, 99.6% of total elapsed

 ./A +RTS -A50M -s  0.10s user 

Re: [Haskell-cafe] Why is boxed mutable array so slow?

2012-12-01 Thread Branimir Maksimovic

Thanks! I have downloaded tool and playing with it.I will use boxed vectors in 
the future ;)


 Date: Sat, 1 Dec 2012 13:22:47 -0500
 Subject: Re: [Haskell-cafe] Why is boxed mutable array so slow?
 From: don...@gmail.com
 To: bm...@hotmail.com
 CC: haskell-cafe@haskell.org
 
 Regarding when to use mutable arrays versus vectors, I would always
 use vectors -- they optimize better, and have a better interface.
 
 Also, I have updated and released a new version of the tool mentioned below.
 You can get it on Hackage, updated to ghc 7 series.
 
 http://hackage.haskell.org/package/ghc-gc-tune-0.3
 
 For your boxed vector program, we get results that show a clear
 performance peak with a -A of around 64M, about  the size of the
 allocated array ...
 
 http://i.imgur.com/dZ2Eo.png
 
 Best settings for Running time:
 0.16s:  +RTS -A67108864 -H1048576
 0.16s:  +RTS -A67108864 -H2097152
 0.16s:  +RTS -A67108864 -H8388608
 
 E.g.
 
 $ time ./A +RTS -A67M -H1M
 boxed vector
 last 945735787 seconds 0.123
 
 -- Don
 
 On Sat, Dec 1, 2012 at 12:20 PM, Don Stewart don...@gmail.com wrote:
  The obvious difference between boxed and unboxed arrays is that the
  boxed arrays are full of pointers to heap allocated objects. This
  means you pay indirection to access the values, much more time in GC
  spent chasing pointers (though card marking helps), and generally do
  more allocation.
 
  Compare the GC stats below, for
 
  * Boxed vector: 88M bytes copied; 75% of time in GC, 0.472s
  * Unboxed vector: 11k bytes copied, 1.3% of time in GC, 0.077s
 
  So there's your main answer. The increased data density of unboxed
  arrays also helps a too.
 
  Now, you can help out  the GC signifcantly by hinting at how much
  you're going to allocated in the youngest generation (see the
  ghc-gc-tune app for a methodical approach to this, though it needs
  updating to ghc 7 --
  http://donsbot.wordpress.com/2010/07/05/ghc-gc-tune-tuning-haskell-gc-settings-for-fun-and-profit/
   and 
  http://stackoverflow.com/questions/3171922/ghcs-rts-options-for-garbage-collection
  ).
 
  Use the +RTS -A flag to set an initial youngest generation heap size
  to the size of your array, and watch the GC cost disappear. For our
  boxed vector, we'd use +RTS -A50M, resulting in:
 
  * Boxed vector: 8k copied, 1% of time in GC, 0.157s
 
  So not bad. 3x speedup through a RTS flag. -A is very useful if you
  are working with boxed, mutable arrays.
 
  For reference, there's a generic version below that specializes based
  on the vector type parameter.
 
  -
 
  {-# LANGUAGE BangPatterns #-}
 
  import System.CPUTime
  import Text.Printf
  import Data.Int
  import Control.DeepSeq
  import System.Mem
 
  import qualified Data.Vector.Mutable as V
  import qualified Data.Vector.Unboxed.Mutable as U
  import qualified Data.Vector.Generic.Mutable as G
 
  main :: IO()
  main = do
 
  --   (G.new n' :: IO (V.IOVector Int32)) = test' boxed vector
  --   performGC
 (G.new n' :: IO (U.IOVector Int32)) = test' unboxed vector
 performGC
 
  test' s a = do
  putStrLn s
  begin - getCPUTime
  init'' a
  partial_sum' a
  end - getCPUTime
  let diff = (fromIntegral (end - begin)) / (10**12)
  last - G.read a (n'-1)
  printf last %d seconds %.3f\n last (diff::Double)
 
  n' :: Int
  n' = 1000 * 1000
 
  init'' !a = init 0 (n'-1)
where
  init :: Int - Int - IO ()
  init !k !n
  | k  n = return ()
  | otherwise = do
  let !x = fromIntegral $ k + k `div` 3
  G.write a k x
  init (k+1) n
 
 
 
  partial_sum' !a = do
  k - G.read a 0
  ps 1 (n'-1) k
where
  ps :: Int - Int - Int32 - IO ()
  ps i n s
  | i  n = return ()
  | otherwise = do
  k - G.read a i
  let !l = fromIntegral $ s + k
  G.write a i l
  ps (i+1) n l
 
 
  -
 
  $ time ./A +RTS -s
  boxed vector
  last 945735787 seconds 0.420
40,121,448 bytes allocated in the heap
88,355,272 bytes copied during GC
24,036,456 bytes maximum residency (6 sample(s))
   380,632 bytes maximum slop
54 MB total memory in use (0 MB lost due to fragmentation)
 
%GC time  75.2%  (75.9% elapsed)
 
Alloc rate359,655,602 bytes per MUT second
 
  ./A +RTS -s  0.40s user 0.07s system 98% cpu 0.475 total
 
 
  $ time ./A +RTS -s
  unboxed vector
  last 945735787 seconds 0.080
 4,113,568 bytes allocated in the heap
11,288 bytes copied during GC
 4,003,256 bytes maximum residency (3 sample(s))
   182,856 bytes maximum slop
 5 MB total memory in use (0 MB lost due to fragmentation)
 
%GC time   1.3%  (1.3% elapsed)
 
Alloc rate51,416,660 bytes per MUT second
 
  ./A +RTS -s  0.08s user 0.01s system 98% cpu 0.088 total
 
 
  $ time ./A +RTS -A50M -s

Re: [Haskell-cafe] Simple GUI Form

2012-12-01 Thread Rune Harder Bak
Sure, and thanks for sharing.
We need more declarative GUI building!

On Sat, Dec 1, 2012 at 9:00 AM,  timothyho...@seznam.cz wrote:
 I created that for a specific application I was writing, where I was saving
 the data directly to disk.  I modularized as an afterthought.  But I don't
 see anything wrong with using types that have JS in their names.  It is
 STILL a haskell data structure.  Just one which can be converted to a
 relatively universally parse-able String easily when needed ;)

 It's fine if you don't use gtk.  I just wanted to show off my code :)

 Timothy


 -- Původní zpráva --
 Od: Rune Harder Bak r...@bak.dk
 Datum: 30. 11. 2012
 Předmět: Re: [Haskell-cafe] Simple GUI Form


 I know it's not wx, but if you were willing to use GTK, you could simply
 install:

 http://hackage.haskell.org/package/gtk-jsinput

 and generate the form automatically as described in:


 https://github.com/timthelion/gtk-jsinput/blob/master/Graphics/UI/Gtk/Custom/JSInput.hs

 Cool!

 Would rather see it generated from the haskell data structure than from
 json,
 but you could of cause generate json from the data type.

 My understand is that GTK applications don't look and feel native on
 windows/mac, that's why I went for wx,
 but I could easily be wrong.

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Simple GUI Form

2012-12-01 Thread timothyhobbs
Also GTK, but I ran across this a little while ago http://hackage.haskell.
org/package/barrie not sure if it still builds as it's a bit old, but it 
looked interesting.


-- Původní zpráva --
Od: Rune Harder Bak r...@bak.dk
Datum: 1. 12. 2012
Předmět: Re: Re: [Haskell-cafe] Simple GUI Form

Sure, and thanks for sharing.
We need more declarative GUI building!

On Sat, Dec 1, 2012 at 9:00 AM, timothyho...@seznam.cz wrote:
 I created that for a specific application I was writing, where I was 
saving
 the data directly to disk. I modularized as an afterthought. But I don't
 see anything wrong with using types that have JS in their names. It is
 STILL a haskell data structure. Just one which can be converted to a
 relatively universally parse-able String easily when needed ;)

 It's fine if you don't use gtk. I just wanted to show off my code :)

 Timothy


 -- Původní zpráva --
 Od: Rune Harder Bak r...@bak.dk
 Datum: 30. 11. 2012
 Předmět: Re: [Haskell-cafe] Simple GUI Form


 I know it's not wx, but if you were willing to use GTK, you could simply
 install:

 http://hackage.haskell.org/package/gtk-jsinput
(http://hackage.haskell.org/package/gtk-jsinput)

 and generate the form automatically as described in:


 https://github.com/timthelion/gtk-jsinput/blob/master/Graphics/UI/Gtk/
Custom/JSInput.hs
(https://github.com/timthelion/gtk-jsinput/blob/master/Graphics/UI/Gtk/Custom/JSInput.hs)

 Cool!

 Would rather see it generated from the haskell data structure than from
 json,
 but you could of cause generate json from the data type.

 My understand is that GTK applications don't look and feel native on
 windows/mac, that's why I went for wx,
 but I could easily be wrong.___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] debugging memory corruption

2012-12-01 Thread Evan Laforge
Ever since upgrading to 7.6.1 I regularly get panics like this:

seq: internal error: evacuate: strange closure type -1958168540
(GHC version 7.6.1 for x86_64_apple_darwin)
Please report this as a GHC bug:  http://www.haskell.org/ghc/reportabug

I've seen some variations, but basically I think it just means someone
is corrupting memory and it goes unnoticed until the GC trips over it.

This happens infrequently (maybe once in 15m, very roughly, it's
extremely inconsistent) in non-optimized code, and frequently (maybe
once in 5m) in optimized code.  This only happens during interactive
use, not during testing or profiling.

I had a problem like this once before, and it took a very long time to
track down.  And in fact, I never really tracked it down, I just got a
test that could semi-reliably reproduce it, and then by trial and
error discovered that if I changed the alignment of a particular
Storable instance from 1 to 4, the problem stopped happening (and 1
should have been correct, it was a struct of chars).  Not exactly a
satisfying solution, and now I'm thinking all I did was make ghc 6
stop manifesting it, and with 7 it's back again.

I'm most suspicious of the FFI usage since it's easy to corrupt memory
in C++ and even easier to write a bogus Storable instance that does
the same, but I really have no idea what or where.  My first thought
was to start cutting out portions (especially FFI-using portions) to
try to isolate it, but it's slow going because it can sometimes take
quite a while to the bug to come out again.  My second thought was
that I need a more automated way to reproduce it, but it's nontrivial
because it only comes up when I'm using the interactive GUI parts,
which are also incidentally a big chunk of FFI.  And even if I do get
a repro, as I did before, it just means I can more quickly poke
randomly at things hoping they change it, but even if I can get it to
stop happening it doesn't mean I understood it, or even really fixed
it.  This is also the kind of bug (well, it was last time), which is
highly dependent on the code, so add one print and it stops happening.
 I have a sort of complicated scheme where I pass a wrapped haskell
function callback along with a wrapped freeHaskellFunPtr to free the
last one, along with itself, maybe it's something to do with that.

Anyone out there with ideas or advice on how to track down this kind
of bug?  My next thought is to try to automate the GUI parts, or maybe
just the FFI part, so I can write a program to randomly fuss with it
until it crashes.  I've also tried valgrind, but it doesn't report
anything suspicious.  But it also doesn't seem to work on FFI Storable
corruption, I've tried intentionally inserting a bad poke and valgrind
still won't report it.

Thanks in advance for any insight!


Actually, there's a whole other discussion which has been nagging at
me for a while, though another thread would be more appropriate.  But
in short it's that it feels like hsc2hs is just too low level, and too
error-prone.  It's tempting to use because it comes with ghc, but it
seems bad to tell people haskell is a safe language, but as soon as
you want to talk to C you're writing totally unchecked pokes and
peeks.  Maybe I should go evaluate the alternatives like c2hs, or
maybe safety features can added to hsc2hs.  Wouldn't it be nice to
have ghc come with a high level and safe FFI language?

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Can not use ST monad with polymorphic function

2012-12-01 Thread David Menendez
On Thu, Nov 29, 2012 at 7:50 AM, Dmitry Kulagin dmitry.kula...@gmail.comwrote:

 Thank you, MigMit!

 If I replace your type FoldSTVoid with:
 data FoldMVoid = FoldMVoid {runFold :: Monad m = (Int - m ()) - m ()}

 then everything works magically with any monad!
 That is exactly what I wanted, though I still do not quite understand why
 wrapping the type solves the problem


Short answer: It's because GHC's type system is predicative.

Basically, quantified types can't be given as arguments to type
constructors (other than -, which is its own thing). I'm not entirely sure
why, but it apparently makes the type system very complicated from a
theoretical standpoint. By wrapping the quantified type in a newtype, the
argument to IO becomes simple enough not to cause problems.

-- 
Dave Menendez d...@zednenem.com
http://www.eyrie.org/~zednenem/
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Why is boxed mutable array so slow?

2012-12-01 Thread Daniel Fischer
On Samstag, 1. Dezember 2012, 16:09:05, Branimir Maksimovic wrote:
 All in all even unboxed array is about 10 times slower than Java version.
 I don't understand why is even unboxed array so slow.

It's not the unboxed arrays that are slow.

Your code has a couple of weak spots, and GHC's native code generator has a 
weakness that bites here.

On my box, I don't quite have a 10× difference to my translation to Java, it's 
a bit less than 7× (0.82s vs 0.12s - I don't want to bring my box to its knees 
by running something that takes 3GB+ of RAM, so I run the unboxed array part 
only) with the LLVM backend and 8× (0.93s) with the native code generator. 
That's in the same ballpark, though.

So what's the deal?

Main.main_$s$wa1 [Occ=LoopBreaker]
  :: GHC.Prim.Int#
 - GHC.Prim.Int#
 - GHC.Prim.State# GHC.Prim.RealWorld
 - GHC.Types.Int
 - GHC.Types.Int
 - GHC.Types.Int
 - ...

Your loops carry boxed Ints around, that's always a bad sign. In this case it 
doesn't hurt too much, however, since these values are neither read nor 
substituted during the loop (they're first and last index of the array and 
number of elements). Additionally, they carry an IOUArray constructor around. 
That is unnecessary. Eliminating a couple of dead parameters


init' a = do
(_,n) - getBounds a
let init k
  | k  n = return ()
  | otherwise = do
  let x = fromIntegral $ k + k `div` 3
  unsafeWrite a k x
  init (k+1)
init 0

partial_sum a = do
(_,n) - getBounds a
let ps i s
  | i  n = return ()
  | otherwise = do
  k - unsafeRead a i
  let l = s + k
  unsafeWrite a i l
  ps (i+1) l
k - unsafeRead a 0
ps 1 k

brings the time for the native code generator down to 0.82s, and for the LLVM 
backend the time remains the same.

Next problem, you're using `div` for the division.

`div` does some checking and potentially fixup (not here, since everything is 
non-negative) after the machine division because `div` is specified to satisfy

a = (a `div` b) * b + (a `mod` b)

with 0 = a `mod` b  abs b.

That is in itself slower than the pure machine division you get with quot.

So let's see what we get with `quot`.

0.65s with the native code generator, and 0.13 with the LLVM backend.

Whoops, what's that?

The problem is, as can be seen by manually replacing k `quot` 3 with

(k *2863311531) `shiftR` 33

(requires 64-bit Ints; equivalent in Java: k*28..1L  33), when the native 
backend, the LLVM backend and Java (as well as C) all take more or less the 
same time [well, the NCG is a bit slower than the other two, 0.11s, 0.11s, 
0.14s], that division is a **very** slow operation.

Java and LLVM know how to replace the division by the constant 3 with a 
mulitplication, a couple of shifts and an addition (since we never have 
negative numbers here, just one multiplication and shift suffice, but neither 
Java nor LLVM can do that on their own because it's not guaranteed by the 
type). The native code generator doesn't - not yet.

So the programme spends the majority of the time dividing. The array reads and 
writes are on par with Java's (and, for that matter, C's).

If you make the divisor a parameter instead of a compile time constant, the 
NCG is not affected at all, the LLVM backend gives you equal performance (it 
can't optimise a division by a divisor it doesn't know). Java is at an 
advantage there, after a while the JIT sees that it might be a good idea to 
optimise the division and so its time only trebles.

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Is anyone working on a sparse matrix library in Haskell?

2012-12-01 Thread Adrien Haxaire
Hello,

I started a FEM library, funfem [1], but I stopped it for the moment; Haskell 
is my hobby and I work on FEM all day long, I prefer to focus on orthogonal 
problems for home projects. It is a very naive implementation. Far from a 
version 0.0.1 too, i.e. unusable at the moment. 

I did not test the performance as it was not my main goal, so the following 
may not be completely relevant to your purpose.

I define a type Tensor [2], which is a sparse matrix based on Data.Map. Not 
sure how efficient it is, I chose to start simple. The resulting conjugate 
gradient [3] is very clear.

Please let us know how it goes, it's good to see more traction towards Haskell 
from our field, and I'll be glad to use your library !

Best regards,
Adrien

[1] http://www.funfem.org

[2] 
https://github.com/adrienhaxaire/funfem/blob/master/Numeric/Funfem/Algebra/Tensor.hs

[3] 
https://github.com/adrienhaxaire/funfem/blob/master/Numeric/Funfem/Algebra/Solver/CG.hs



On Thursday 29 November 2012 14:03:04 Mark Flamer wrote:
 I am looking to continue to learn Haskell while working on something that
 might eventually be useful to others and get posted on Hackage. I have
 written quite a bit of Haskell code now, some useful and a lot just throw
 away for learning. In the past others have expressed interest in having a
 native Haskell sparse matrix and linear algebra library available(not just
 bindings to a C lib). This in combination with FEM is one of my interests.
 So my questions, is anyone currently working on a project like this? Does it
 seem like a good project/addition to the community? I'm also interested if
 anyone has any other project idea's, maybe even to collaborate on. Thanks
 
 
 
 --
 View this message in context:
 http://haskell.1045720.n5.nabble.com/Is-anyone-working-on-a-sparse-matrix-l
 ibrary-in-Haskell-tp5721452.html Sent from the Haskell - Haskell-Cafe
 mailing list archive at Nabble.com.
 
 ___
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe
-- 
Adrien Haxaire
www.adrienhaxaire.org | @adrienhaxaire

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Is anyone working on a sparse matrix library in Haskell?

2012-12-01 Thread Adrien Haxaire
Woops, forgot I switched to darcs after some time. The latest version can be 
found here:

http://www.funfem.org/browser/Numeric/Funfem/Algebra

Adrien

On Sunday 02 December 2012 00:00:32 Adrien Haxaire wrote:
 Hello,
 
 I started a FEM library, funfem [1], but I stopped it for the moment;
 Haskell is my hobby and I work on FEM all day long, I prefer to focus on
 orthogonal problems for home projects. It is a very naive implementation.
 Far from a version 0.0.1 too, i.e. unusable at the moment.
 
 I did not test the performance as it was not my main goal, so the following
 may not be completely relevant to your purpose.
 
 I define a type Tensor [2], which is a sparse matrix based on Data.Map. Not
 sure how efficient it is, I chose to start simple. The resulting conjugate
 gradient [3] is very clear.
 
 Please let us know how it goes, it's good to see more traction towards
 Haskell from our field, and I'll be glad to use your library !
 
 Best regards,
 Adrien
 
 [1] http://www.funfem.org
 
 [2]
 https://github.com/adrienhaxaire/funfem/blob/master/Numeric/Funfem/Algebra/T
 ensor.hs
 
 [3]
 https://github.com/adrienhaxaire/funfem/blob/master/Numeric/Funfem/Algebra/S
 olver/CG.hs
 On Thursday 29 November 2012 14:03:04 Mark Flamer wrote:
  I am looking to continue to learn Haskell while working on something that
  might eventually be useful to others and get posted on Hackage. I have
  written quite a bit of Haskell code now, some useful and a lot just throw
  away for learning. In the past others have expressed interest in having a
  native Haskell sparse matrix and linear algebra library available(not just
  bindings to a C lib). This in combination with FEM is one of my interests.
  So my questions, is anyone currently working on a project like this? Does
  it seem like a good project/addition to the community? I'm also
  interested if anyone has any other project idea's, maybe even to
  collaborate on. Thanks
  
  
  
  --
  View this message in context:
  http://haskell.1045720.n5.nabble.com/Is-anyone-working-on-a-sparse-matrix-
  l
  ibrary-in-Haskell-tp5721452.html Sent from the Haskell - Haskell-Cafe
  mailing list archive at Nabble.com.
  
  ___
  Haskell-Cafe mailing list
  Haskell-Cafe@haskell.org
  http://www.haskell.org/mailman/listinfo/haskell-cafe
-- 
Adrien Haxaire
www.adrienhaxaire.org | @adrienhaxaire

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] what is the purpose of GHC.Exts.lazy?

2012-12-01 Thread jared simpson
docs say:

The call '(lazy e)' means the same as e, but 
lazyhttp://hackage.haskell.org/packages/archive/base/latest/doc/html/GHC-Exts.html#v:lazy
 has 
a magical strictness property: it is lazy in its first argument, even 
though its semantics is strict.
why do i want to use magic during strictness analysis?
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] debugging memory corruption

2012-12-01 Thread Alexander Kjeldaas
What I've mostly done in similar circumstances (jni)

1. Create an interface (virtual functions or template) for the FFI in C++
that covers everything you use. Then create one test implementation and one
real implementation. The test implementation must allocate resources
whenever the real FFI does so. Doing memory allocation works. This makes it
possible to test all your FFI in C++ using valgrind.

2. Add tracing support to the real implementation and replay support to the
test implementation.

3. Upload to Hackage.

Alexander
On Dec 1, 2012 5:06 PM, Evan Laforge qdun...@gmail.com wrote:

 Ever since upgrading to 7.6.1 I regularly get panics like this:

 seq: internal error: evacuate: strange closure type -1958168540
 (GHC version 7.6.1 for x86_64_apple_darwin)
 Please report this as a GHC bug:
 http://www.haskell.org/ghc/reportabug

 I've seen some variations, but basically I think it just means someone
 is corrupting memory and it goes unnoticed until the GC trips over it.

 This happens infrequently (maybe once in 15m, very roughly, it's
 extremely inconsistent) in non-optimized code, and frequently (maybe
 once in 5m) in optimized code.  This only happens during interactive
 use, not during testing or profiling.

 I had a problem like this once before, and it took a very long time to
 track down.  And in fact, I never really tracked it down, I just got a
 test that could semi-reliably reproduce it, and then by trial and
 error discovered that if I changed the alignment of a particular
 Storable instance from 1 to 4, the problem stopped happening (and 1
 should have been correct, it was a struct of chars).  Not exactly a
 satisfying solution, and now I'm thinking all I did was make ghc 6
 stop manifesting it, and with 7 it's back again.

 I'm most suspicious of the FFI usage since it's easy to corrupt memory
 in C++ and even easier to write a bogus Storable instance that does
 the same, but I really have no idea what or where.  My first thought
 was to start cutting out portions (especially FFI-using portions) to
 try to isolate it, but it's slow going because it can sometimes take
 quite a while to the bug to come out again.  My second thought was
 that I need a more automated way to reproduce it, but it's nontrivial
 because it only comes up when I'm using the interactive GUI parts,
 which are also incidentally a big chunk of FFI.  And even if I do get
 a repro, as I did before, it just means I can more quickly poke
 randomly at things hoping they change it, but even if I can get it to
 stop happening it doesn't mean I understood it, or even really fixed
 it.  This is also the kind of bug (well, it was last time), which is
 highly dependent on the code, so add one print and it stops happening.
  I have a sort of complicated scheme where I pass a wrapped haskell
 function callback along with a wrapped freeHaskellFunPtr to free the
 last one, along with itself, maybe it's something to do with that.

 Anyone out there with ideas or advice on how to track down this kind
 of bug?  My next thought is to try to automate the GUI parts, or maybe
 just the FFI part, so I can write a program to randomly fuss with it
 until it crashes.  I've also tried valgrind, but it doesn't report
 anything suspicious.  But it also doesn't seem to work on FFI Storable
 corruption, I've tried intentionally inserting a bad poke and valgrind
 still won't report it.

 Thanks in advance for any insight!


 Actually, there's a whole other discussion which has been nagging at
 me for a while, though another thread would be more appropriate.  But
 in short it's that it feels like hsc2hs is just too low level, and too
 error-prone.  It's tempting to use because it comes with ghc, but it
 seems bad to tell people haskell is a safe language, but as soon as
 you want to talk to C you're writing totally unchecked pokes and
 peeks.  Maybe I should go evaluate the alternatives like c2hs, or
 maybe safety features can added to hsc2hs.  Wouldn't it be nice to
 have ghc come with a high level and safe FFI language?

 ___
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] To my boss: The code is cool, but it is about 100 times slower than the old one...

2012-12-01 Thread wren ng thornton

On 11/29/12 2:17 PM, Ivan Salazar wrote:

The bad side is that direct translation of algorithms are almost always
very slow and the work needed to make them perform is very mind bending.


Indeed. The thing is, all algorithms make (implicit) assumptions about 
the cost model of the underlying language. The problem comes about 
because the assumptions made in most algorithms are not valid in 
Haskell. This isn't just an issue of laziness; the entire computational 
model of Haskell (e.g., STG) is radically different from imperative 
languages.


For example, in many cases just passing arguments around (a la the State 
monad) is much faster than using ST and explicit mutation. GHC does a 
lot of clever code reorganization, but mutation breaks countless purity 
laws and so it inhibits the application of these optimizations. 
Unfortunately, the vast majority of algorithms out there assume you're 
working in a language where mutation is essentially free. I'm not 
talking about any significant theoretical issues about using mutation or 
not; I'm just talking about the implicit assumptions that algorithm 
implementers make. If you believe mutation is essentially free then it 
makes sense to create an initial object and then incrementally mutate it 
this way and that until you get to where you want. But, a great deal of 
the time there's nothing actually stopping us from gathering all the 
information and allocating the final result directly. However, if you're 
not used to thinking in those terms, then the conceptual reorganization 
required to see how to gather all the data at once is indeed mind-bending.


--
Live well,
~wren

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Is anyone working on a sparse matrix library in Haskell?

2012-12-01 Thread wren ng thornton

On 11/30/12 4:58 PM, Mark Flamer wrote:

Thanks for all the replies,
  It sounds like there is enough interest and even some potential
collaborators out there. I have created a few data structures to represent
sparse vectors and matrices. The vector was a simple binary tree and the
matrix a quad tree. As I suspected a standard IntMap was around 3X as fast
as my binary tree, so I have switched to the IntMap for now. I was hoping to
hold on to my quad tree for the matrix rep. because I like the elegance of
the recursive algorithms like Strassen’s for multiplication. In the end it
will most likely be best to have a few different matrix representations
anyway. For instance, I know that compressed row form is most efficient for
certain algorithms. I have a Gauss iterative solver working currently and am
thinking of moving on to a parallel Conjugate gradient or direct solver
using LU decomposition. I’m no expert in Haskell or numeric methods but I do
my best.


I've also been working haphazardly on some similar stuff lately. 
However, my focus is rather different[1] so I'm afeared not much code 
sharing could happen at the moment. While I'm certainly no expert on 
numerical methods, I seem to have acquired some experience in that 
domain so I may be able to lend a hand from time to time.



[1] In particular my goal has been to revive some old ideas about making 
linear algebra well-typed. The vast majority (if not all) of extant 
linear algebra systems are poorly typed and will do stupid things to 
resolve type errors (e.g., automatically padding, truncating, and 
reshaping things). Because of the use case I have in mind, this project 
also involves setting up a proper numerical type-class hierarchy (i.e., 
one which expresses semirings and other things ignored by the numerical 
hierarchies out there today). My goal for all this is in setting up the 
type system, not performance. I figure there are other folks who know 
and care a lot more about the numerical tricks of giving the actual 
implementations.


--
Live well,
~wren

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Is anyone working on a sparse matrix library in Haskell?

2012-12-01 Thread Kim-Ee Yeoh
On Sun, Dec 2, 2012 at 10:52 AM, wren ng thornton w...@freegeek.org wrote:

  My goal for all this is in setting up the type system, not performance.
I figure there are other folks who know and care a lot more about the
numerical tricks of giving the actual implementations.

You've got my support -- old-school optimizations, numerical, compiler, or
otherwise, are deadly boring. Death to them, I say! If we don't explore
uncharted waters, who will?

-- Kim-Ee


On Sun, Dec 2, 2012 at 10:52 AM, wren ng thornton w...@freegeek.org wrote:

 On 11/30/12 4:58 PM, Mark Flamer wrote:

 Thanks for all the replies,
   It sounds like there is enough interest and even some potential
 collaborators out there. I have created a few data structures to represent
 sparse vectors and matrices. The vector was a simple binary tree and the
 matrix a quad tree. As I suspected a standard IntMap was around 3X as fast
 as my binary tree, so I have switched to the IntMap for now. I was hoping
 to
 hold on to my quad tree for the matrix rep. because I like the elegance of
 the recursive algorithms like Strassen’s for multiplication. In the end it
 will most likely be best to have a few different matrix representations
 anyway. For instance, I know that compressed row form is most efficient
 for
 certain algorithms. I have a Gauss iterative solver working currently and
 am
 thinking of moving on to a parallel Conjugate gradient or direct solver
 using LU decomposition. I’m no expert in Haskell or numeric methods but I
 do
 my best.


 I've also been working haphazardly on some similar stuff lately. However,
 my focus is rather different[1] so I'm afeared not much code sharing could
 happen at the moment. While I'm certainly no expert on numerical methods, I
 seem to have acquired some experience in that domain so I may be able to
 lend a hand from time to time.


 [1] In particular my goal has been to revive some old ideas about making
 linear algebra well-typed. The vast majority (if not all) of extant linear
 algebra systems are poorly typed and will do stupid things to resolve
 type errors (e.g., automatically padding, truncating, and reshaping
 things). Because of the use case I have in mind, this project also involves
 setting up a proper numerical type-class hierarchy (i.e., one which
 expresses semirings and other things ignored by the numerical hierarchies
 out there today). My goal for all this is in setting up the type system,
 not performance. I figure there are other folks who know and care a lot
 more about the numerical tricks of giving the actual implementations.

 --
 Live well,
 ~wren


 __**_
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/**mailman/listinfo/haskell-cafehttp://www.haskell.org/mailman/listinfo/haskell-cafe

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Why is boxed mutable array so slow?

2012-12-01 Thread Branimir Maksimovic

Wow, now performance is on par with Java ;)So slow division was main problem, 
that and GC .
Thanks!

 From: daniel.is.fisc...@googlemail.com
 To: haskell-cafe@haskell.org
 CC: bm...@hotmail.com
 Subject: Re: [Haskell-cafe] Why is boxed mutable array so slow?
 Date: Sat, 1 Dec 2012 21:12:29 +0100
 
 On Samstag, 1. Dezember 2012, 16:09:05, Branimir Maksimovic wrote:
  All in all even unboxed array is about 10 times slower than Java version.
  I don't understand why is even unboxed array so slow.
 
 It's not the unboxed arrays that are slow.
 
 Your code has a couple of weak spots, and GHC's native code generator has a 
 weakness that bites here.
 
 On my box, I don't quite have a 10× difference to my translation to Java, 
 it's 
 a bit less than 7× (0.82s vs 0.12s - I don't want to bring my box to its 
 knees 
 by running something that takes 3GB+ of RAM, so I run the unboxed array part 
 only) with the LLVM backend and 8× (0.93s) with the native code generator. 
 That's in the same ballpark, though.
 
 So what's the deal?
 
 Main.main_$s$wa1 [Occ=LoopBreaker]
   :: GHC.Prim.Int#
  - GHC.Prim.Int#
  - GHC.Prim.State# GHC.Prim.RealWorld
  - GHC.Types.Int
  - GHC.Types.Int
  - GHC.Types.Int
  - ...
 
 Your loops carry boxed Ints around, that's always a bad sign. In this case it 
 doesn't hurt too much, however, since these values are neither read nor 
 substituted during the loop (they're first and last index of the array and 
 number of elements). Additionally, they carry an IOUArray constructor around. 
 That is unnecessary. Eliminating a couple of dead parameters
 
 
 init' a = do
 (_,n) - getBounds a
 let init k
   | k  n = return ()
   | otherwise = do
   let x = fromIntegral $ k + k `div` 3
   unsafeWrite a k x
   init (k+1)
 init 0
 
 partial_sum a = do
 (_,n) - getBounds a
 let ps i s
   | i  n = return ()
   | otherwise = do
   k - unsafeRead a i
   let l = s + k
   unsafeWrite a i l
   ps (i+1) l
 k - unsafeRead a 0
 ps 1 k
 
 brings the time for the native code generator down to 0.82s, and for the LLVM 
 backend the time remains the same.
 
 Next problem, you're using `div` for the division.
 
 `div` does some checking and potentially fixup (not here, since everything is 
 non-negative) after the machine division because `div` is specified to satisfy
 
 a = (a `div` b) * b + (a `mod` b)
 
 with 0 = a `mod` b  abs b.
 
 That is in itself slower than the pure machine division you get with quot.
 
 So let's see what we get with `quot`.
 
 0.65s with the native code generator, and 0.13 with the LLVM backend.
 
 Whoops, what's that?
 
 The problem is, as can be seen by manually replacing k `quot` 3 with
 
 (k *2863311531) `shiftR` 33
 
 (requires 64-bit Ints; equivalent in Java: k*28..1L  33), when the native 
 backend, the LLVM backend and Java (as well as C) all take more or less the 
 same time [well, the NCG is a bit slower than the other two, 0.11s, 0.11s, 
 0.14s], that division is a **very** slow operation.
 
 Java and LLVM know how to replace the division by the constant 3 with a 
 mulitplication, a couple of shifts and an addition (since we never have 
 negative numbers here, just one multiplication and shift suffice, but neither 
 Java nor LLVM can do that on their own because it's not guaranteed by the 
 type). The native code generator doesn't - not yet.
 
 So the programme spends the majority of the time dividing. The array reads 
 and 
 writes are on par with Java's (and, for that matter, C's).
 
 If you make the divisor a parameter instead of a compile time constant, the 
 NCG is not affected at all, the LLVM backend gives you equal performance (it 
 can't optimise a division by a divisor it doesn't know). Java is at an 
 advantage there, after a while the JIT sees that it might be a good idea to 
 optimise the division and so its time only trebles.
  ___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe