[Haskell-cafe] [ANN] Accelerate version 0.12: GPU computing with Haskell
We just released version 0.12 of Data.Array.Accelerate, the GPGPU [1] library for Haskell: http://justtesting.org/gpu-accelerated-array-computations-in-haskell

This is a beta release. The library is not perfect, but it is definitely usable, and we are looking for early adopters.

Manuel

[1] Currently only NVIDIA GPUs are supported, via NVIDIA's CUDA framework.

___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
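For readers wondering what Accelerate code looks like, the canonical example is the dot product from the Accelerate paper. The sketch below follows that example; the exact module layout and `fromList` signature may differ slightly in the 0.12 release, so treat it as illustrative rather than definitive:

    import Data.Array.Accelerate as A
    import Data.Array.Accelerate.Interpreter (run)
    -- swap in Data.Array.Accelerate.CUDA for the GPU backend

    -- Embedded dot product: 'use' embeds host arrays into the
    -- Accelerate computation; zipWith/fold run on the device.
    dotp :: Vector Float -> Vector Float -> Acc (Scalar Float)
    dotp xs ys = A.fold (+) 0 (A.zipWith (*) (use xs) (use ys))

    main :: IO ()
    main = print . run $ dotp v v
      where v = fromList (Z :. 5) [1..5]

The same `dotp` expression can be handed to either the reference interpreter or the CUDA backend's `run`, which is the point of the design: the program is a first-class embedded AST.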
Re: [Haskell-cafe] Can Haskell outperform C++?
On 05/11/2012 07:53 AM, Ertugrul Söylemez wrote:
> My point is: if you need C-like performance at a certain spot, there is really no excuse for writing the entire application in C. Haskell has a working, powerful enough FFI. Also, idiomatic Haskell code nowadays performs close to C. If your code doesn't, chances are that it's not even idiomatic. Sorting a list is easy and beautiful in code, but it's far from idiomatic. Using ST with destructive update is also not idiomatic.
>
> One of my performance masterpieces is the instinct AI library (see Hackage). It uses only immutable vectors and performs very heavy Double calculations, yet it performs better than the same code with mutable arrays did. With a few years of Haskell experience in my backpack I know how to utilize laziness to get amazing performance for code that most people would feel must be written with a destructively updating loop. And the resulting code is just beautiful to read and to watch at work, a great demonstration of the wonders the GHC developers have created.

Hello Ertugrul,

Out of curiosity, did you compare the performance of Instinct with implementations in languages associated with numerical computation, like Octave?
Re: [Haskell-cafe] darcs patch dependencies in dot format
Sönke Hahn sh...@cs.tu-berlin.de writes:
> On 05/13/2012 03:13 AM, Felipe Almeida Lessa wrote:
>> Truly amazing! Yes, nice work! I wonder how it would fare with larger repositories. =)
> It does not scale well. [...] Somewhat related questions are: what am I going to do with a dot graph that has more than 500 vertices? Is there an intelligent way to reduce the graph?

Lacking a solution to the problem of drawing large graphs, making the graph smaller might be the second choice. :-) One option might be to track dependencies only back to a specified tag? Or between tags?

-k
-- If I haven't seen further, it is by standing in the footprints of giants
Re: [Haskell-cafe] Bug in SYB, SYB-documentation or GHC-API (or I messed up somewhere)
Dear Pedro,

On 13 May 2012, at 20:29, José Pedro Magalhães wrote:
> On Fri, May 11, 2012 at 5:12 PM, Philip K. F. Hölzenspies p...@st-andrews.ac.uk wrote:
>> However, it turns out that this never leads to an evaluation of extTyNamesLoc. I've come as far as to see that, since Located a is a synonym for GenLocated SrcSpan a, the type we're looking for really has a binary constructor, so we should use ext2Q. This made things work: we first modify the function we apply to Located things:
>>
>>   extTyNamesLoc :: (Data loc, Data a) => SrcSpan -> GenLocated loc a -> OccurrenceTable
>>   extTyNamesLoc l (L l' x) = case cast l' of
>>     Just l'' -> extTyNames l'' x
>>     Nothing  -> extTyNames l x
>
> Do you really need this? Can't you use the definition of `extTyNamesLoc` shown previously, and redefine `extTyNames` to use `ext2Q`?

As it turns out, yes I do. You had me doubting whether I had tried this, but doing such a thing gives me:

    refact.hs:124:72:
        Could not deduce (d1 ~ SrcSpan)
        from the context (Data a)
          bound by the type signature for
                     extTyNames :: Data a => SrcSpan -> RefCtx -> a -> OccTab
          at refact.hs:124:1-91
        or from (Data d1, Data d2)
          bound by a type expected by the context:
                     (Data d1, Data d2) => GenLocated d1 d2 -> OccTab
          at refact.hs:124:21-88
        `d1' is a rigid type variable bound by
          a type expected by the context:
            (Data d1, Data d2) => GenLocated d1 d2 -> OccTab
          at refact.hs:124:21
        Expected type: GenLocated d1 d2 -> OccTab
          Actual type: Located d2 -> OccTab
        In the return type of a call of `extTyNamesLoc'
        In the second argument of `ext2Q', namely `extTyNamesLoc l r'

Which makes a certain amount of sense. The use of ext2Q restricts d1 and d2 to instances of Data, but leaves them otherwise unspecified. This seems at odds with the intention of extXY, hence my confusion. I have since found and come to understand extB, though. This makes the implementation of extTyNamesLoc much, much nicer.
However, I still think that finding a good explanation for the behaviour of ext1Q/ext2Q could lead to a helpful addendum to the SYB documentation.

Regards,
Philip
Re: [Haskell-cafe] Can Haskell outperform C++?
Dmitry Vyal akam...@gmail.com wrote:
>> My point is: if you need C-like performance at a certain spot, there is really no excuse for writing the entire application in C. Haskell has a working, powerful enough FFI. Also, idiomatic Haskell code nowadays performs close to C. If your code doesn't, chances are that it's not even idiomatic. Sorting a list is easy and beautiful in code, but it's far from idiomatic. Using ST with destructive update is also not idiomatic.
>>
>> One of my performance masterpieces is the instinct AI library (see Hackage). It uses only immutable vectors and performs very heavy Double calculations, yet it performs better than the same code with mutable arrays did. With a few years of Haskell experience in my backpack I know how to utilize laziness to get amazing performance for code that most people would feel must be written with a destructively updating loop. And the resulting code is just beautiful to read and to watch at work, a great demonstration of the wonders the GHC developers have created.
>
> Out of curiosity, did you compare the performance of Instinct with implementations in languages associated with numerical computation, like Octave?

No, sorry.

Greets,
Ertugrul

--
Key-ID: E5DD8D11 Ertugrul Soeylemez e...@ertes.de
FPrint: BD28 3E3F BE63 BADD 4157 9134 D56A 37FA E5DD 8D11
Keysrv: hkp://subkeys.pgp.net/
Re: [Haskell-cafe] darcs patch dependencies in dot format
On 5/12/12 5:52 AM, Sönke Hahn wrote:
> Hi all!
>
> Yesterday I wrote a little tool to output the dependencies of darcs patches in dot format. The hardest part was to wrap my head around the darcs API and find a way to let it compute the patch dependencies. I don't know if I got it right, but it looks correct at first glance. Here is the code:
>
> https://patch-tag.com/r/shahn/darcs2dot
>
> To use it, just execute the program in a darcs repo and it will output a dot graph to stdout. The graph does not contain all dependencies, but the transitive reduction. The patch names are truncated at 15 characters. And here is an example graph:
>
> http://open-projects.net/~shahn/patchDeps.pdf
>
> These are the patch dependencies of one of my darcs repos (https://patch-tag.com/r/shahn/hate). Some observations I made:
>
> * There are two completely separate subgraphs. One subgraph seems to be for the testsuite, the other for the actual code. This means the two don't depend on each other and could easily be put in two distinct repos.
> * There is a re-implementation patch with a lot of incoming and outgoing edges. (Which makes sense.)
> * There are some sequences of patches (e.g. these four allow ... patches in the upper left corner) that seem to contain related patches.
> * darcs's patch system is awesome!
>
> Any comments or suggestions?
>
> Cheers,
> Sönke

That's nifty, thanks for sharing it. Cc'ing darcs-user.

I tried it in a few repos, like so:

    $ darcs2dot > patchdeps.dot
    $ dot patchdeps.dot -Tpdf -o patchdeps.pdf

In a 200-patch repo it ran quickly, giving:
http://joyful.com/darcsden/simon/darcsum/raw/patchdeps.pdf

In a 2000-patch repo it took 22 hours:
http://joyful.com/darcsden/simon/hledger/raw/patchdeps.pdf

In the 1-patch Darcs repo I killed it after a few hours, but here are the early dependencies (up to tag 1.0.0rc2):
http://joyful.com/darcsden/simon/darcs-sm/raw/patchdeps.pdf

It should escape double-quotes in patch names; I did that manually. I wonder how to simplify the graph further.
Perhaps just the dependencies of tags would be interesting in some repos?

Best,
-Simon
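The quote-escaping Simon mentions could be handled by rewriting each label before emitting it. A minimal sketch (the function name is hypothetical, not part of darcs2dot), escaping the two characters that break double-quoted DOT labels:

    -- Escape a patch name for use inside a double-quoted DOT label.
    -- Backslashes must be escaped first, then double quotes.
    escapeDotLabel :: String -> String
    escapeDotLabel = concatMap esc
      where
        esc '\\' = "\\\\"
        esc '"'  = "\\\""
        esc c    = [c]

For example, a patch named `fix "the" bug` would be emitted as `fix \"the\" bug`, which Graphviz accepts inside a quoted label.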
Re: [Haskell-cafe] Can Haskell outperform C++?
> Well, if it's in many ways the same as C, then again it's probably not idiomatic Haskell.

It's just a recursive equation for Mandelbrot fractals. I should have been more precise: I didn't mean that the source is literally the *same* as the C source (i.e. there's no for loop, no mutable variables), rather that it presents the compiler backend with the same advantages as the C backend would have with the equivalent loop. That is, there's no control flow obfuscation (dictionaries, calls to unknown targets), no problems with laziness, and the data representations are completely known.

    mandel :: Int -> Complex Double -> Int
    mandel max_depth c = loop 0 0
      where
        loop i !z
          | i == max_depth     = i
          | magnitude z >= 2.0 = i
          | otherwise          = loop (i+1) (z*z + c)

It's also a loop that is readily recognized as a loop. Now, to my knowledge, GHC doesn't explicitly recognize loops in any stage of the compiler (so as to perform loop optimizations), something which other functional compilers sometimes do. But, anyway, it turns out that my example above *is easily transformed from a bad GHC performance story into a good one*. If you'll bear with me, I'll show how below.

First, Manuel makes a good point about the LLVM backend. My 6X anecdote was from a while ago and I didn't use LLVM [1]. I redid it just now with 7.4.1+LLVM, results below. (The table should read correctly in a fixed-width font, but you can also see the data in the spreadsheet here: https://docs.google.com/spreadsheet/ccc?key=0AvzAHqQmHo87dHU0T0lCb1I4MFJmM2s4RnNlamJlNkE .)

                          Time (ms)   Compiled file size   Compile+runtime (ms)
    GHC 7.4.1 -O0         2444        1241K
    GHC 7.4.1 -O2         925         1132K                1561
    GHC 7.4.1 -O2 llvm    931         1133K
    GHC 7.0.4 -O2 via-C   684         974K

So LLVM didn't help [1]. And in fact the deprecated via-C backend did the best! Compare with GCC:

    G++ -O0               300         9K                   531
    G++ -O3               110         7K                   347
    G++ -O3 recursive     116         9K

Uh oh, the 6X gap I mentioned is now closer to 9X.
And, in fact, performing a mini language shootout on the above code reveals that GHC is doing worse than not only OCaml, but Chez Scheme, in spite of dynamic type checks and a necessarily boxed representation of complex numbers:

    Chez Scheme 8.4       284         2.7K (not standalone)   372
    OCaml                 166         180K                    301

At least Python does worse!

    Python 2.6            1973        NA                      1973

*So here's the catch:* if you check the Core and STG, GHC 7.4 is actually compiling the above loop very well. This microbenchmark turns into just a magnitude microbenchmark. The implementation of Data.Complex uses an EXTREMELY expensive method to avoid overflow (https://github.com/ghc/packages-base/blob/master/Data/Complex.hs#L115) [2].

Since OCaml did well above, I took a look at their standard library's implementation, on line 51 here: http://caml.inria.fr/cgi-bin/viewvc.cgi/ocaml/trunk/stdlib/complex.ml?revision=11156&view=markup . They use a nice little math trick (the extra division) that is also mentioned on Wikipedia. By implementing the same trick in Haskell, replacing the standard magnitude function (https://github.com/rrnewton/MandelMicrobench/blob/97c3275ad94cbce57a688817332b42f7c32c15b4/mandel_test2.hs), we get the following results:

    GHC 7.4.1, no overflow avoidance            39    1127K    674
    GHC 7.4.1, OCaml-style overflow avoidance   74    1127K

Wow, now not only is the Haskell version faster than OCaml/Scheme, *but it is 48% faster than C++*, which is appropriate to the title of this email! Of course, this probably means that C++'s abs is also doing something overly expensive for overflow avoidance (or that their representation of complex numbers is not fully unpacked and stored in registers). I haven't tracked it down yet. But in any case, I'll be submitting a library patch.

*The moral, I think, is that community members could do a great deal to help Haskell performance by simply microbenchmarking and optimizing library routines in base!*

Cheers,
-Ryan

P.S.
You can check out the above benchmarks from here: https://github.com/rrnewton/MandelMicrobench

[1] P.P.S. Most concerning to me about Haskell/C++ comparisons are David Peixotto's findings that LLVM optimizations are not very effective on Haskell-generated LLVM compared with typical clang-generated LLVM.

[2] P.P.P.S. It turns out there was already a ticket (http://hackage.haskell.org/trac/ghc/ticket/2450) regarding magnitude's performance. But it still has bad performance even after a small refactoring was performed.
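The OCaml-style trick Ryan describes divides both components by the larger absolute value before squaring, so the intermediate ratio is at most 1 and cannot overflow. A sketch of a magnitude function along those lines (the name `fastMagnitude` is hypothetical; this is the idea from OCaml's `Complex.norm`, not the exact patch submitted):

    -- Overflow-avoiding magnitude: scale by the larger component,
    -- so the squared ratio q*q (or p*p) is at most 1.
    fastMagnitude :: Double -> Double -> Double
    fastMagnitude x y
      | r == 0    = i
      | i <= r    = r * sqrt (1 + q*q)
      | otherwise = i * sqrt (1 + p*p)
      where
        r = abs x
        i = abs y
        q = i / r
        p = r / i

Compared with the naive `sqrt (x*x + y*y)`, this costs one extra division but never squares a value larger than 1 relative to the scale factor, which is why it avoids both overflow and the far more expensive scaling machinery in base's Data.Complex.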
[Haskell-cafe] Safe Haskell at the export symbol granularity?
Separate from whether or not we actually want this: is it feasible?

Here's my situation. When working on parallel programming libraries in Haskell, there are very often unsafe operations one wants to do within an otherwise pure model. For example, Accelerate currently violates Safe Haskell because it trusts the user to provide an associative function to parallel fold. No associativity, no referential transparency. The solution is to put fold in a separate namespace and mark that module as Unsafe. Likewise for certain monad-par operations that are unsafe.

But this factoring can have a serious impact. Not only are extra modules required, but extra type classes as well. For example, if Accelerate is ever refactored for Safe Haskell, then the following ParAccelerate type class probably should be as well: https://github.com/simonmar/monad-par/blob/5cc656bc45dc473d7a185ec99bb156192f54d520/abstract-par-accelerate/Control/Monad/Par/Accelerate.hs#L75 I.e. ParAccelerate plus a ParAccelerateUnsafe for the unsafeHybrid operation. But this starts to be death by a thousand organizational factorings!

- The package, abstract-par-accelerate, is already factored out from abstract-par just to avoid an unnecessary Accelerate dependency (which used to mean CUDA errors). (And *another* factoring is possibly warranted depending on whether the module depends on accelerate-io.)
- The file would be separate to mark it as Safe Haskell.
- The type class ParAccelerateUnsafe would be separate so as to put it in a separate file.

What's a possible solution? If, in addition to Safe and Unsafe modules, there were partially safe modules, which exported a mix of safe and unsafe identifiers, then this could all be much cleaner. The simple rule is that any reference whatsoever to an unsafe identifier disqualifies the client code. For example, in the above ParAccelerate type class we would mark the unsafeHybrid binding with something like {-# UNSAFE unsafeHybrid #-}.
We wouldn't even have to factor it out of the type class. Likewise, Accelerate could mark fold as unsafe (providing safe variants as well) without introducing module namespace bloat and confusion.

What do you think?
-Ryan
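For concreteness, the module-level factoring that Safe Haskell forces today looks something like the sketch below (module and function names are hypothetical, not the actual monad-par layout); the proposal would collapse the two files into one:

    -- File 1: must be its own module just to carry the Unsafe pragma.
    {-# LANGUAGE Unsafe #-}
    module Control.Monad.Par.Accelerate.Unsafe (unsafeHybrid) where
    -- trusts the caller; not importable from Safe client code

    -- File 2: everything else, importable under -XSafe.
    {-# LANGUAGE Safe #-}
    module Control.Monad.Par.Accelerate.Safe (spawnAcc) where
    -- safe operations only

Under the per-identifier scheme proposed above, both bindings would live in one module, with only `unsafeHybrid` marked by the (hypothetical) `{-# UNSAFE unsafeHybrid #-}` pragma, and any client code referencing it would lose its Safe status.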
[Haskell-cafe] AI - machine learning
Hi all,

Recently I started to take a look at Haskell, especially at AI. I can see some email addresses of interested people there, but not much other activity behind them. Is there a mailing list specifically for AI? Basically I'm interested in trying some machine learning algorithms: starting with reinforcement learning (value-based methods), and going towards AGI (Artificial General Intelligence). Does anybody know about some already existing Haskell approaches, or is anybody working on this?

Cheers,
m.
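For the value-based reinforcement learning the poster mentions, the textbook starting point is value iteration. A tiny self-contained sketch over a toy deterministic MDP (the three-state chain, the rewards, and all names here are illustrative inventions, not from any existing Haskell AI package):

    import qualified Data.Map as M

    type State = Int
    type Value = M.Map State Double

    states :: [State]
    states = [0, 1, 2]

    -- Deterministic actions as (reward, next state) pairs.
    -- State 2 is absorbing and keeps paying out; elsewhere you
    -- may 'stay' for a small reward or 'advance' for none.
    actions :: State -> [(Double, State)]
    actions 2 = [(1.0, 2)]
    actions s = [(0.1, s), (0.0, s + 1)]

    gamma :: Double
    gamma = 0.9

    -- One Bellman backup: V'(s) = max_a [ r(s,a) + gamma * V(s') ]
    backup :: Value -> Value
    backup v = M.fromList
      [ (s, maximum [ r + gamma * (v M.! s') | (r, s') <- actions s ])
      | s <- states ]

    -- n sweeps of value iteration from the all-zero value function.
    valueIteration :: Int -> Value
    valueIteration n = iterate backup v0 !! n
      where v0 = M.fromList [ (s, 0) | s <- states ]

After enough sweeps the values converge and the greedy policy reads off as "advance" everywhere, since the discounted absorbing reward dominates the small stay reward; the same `backup` shape generalizes to stochastic MDPs by summing over transition probabilities.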
Re: [Haskell-cafe] Can Haskell outperform C++?
Ryan Newton:
> But, anyway, it turns out that my example above is easily transformed from a bad GHC performance story into a good one. If you'll bear with me, I'll show how below.
>
> First, Manuel makes a good point about the LLVM backend. My 6X anecdote was from a while ago and I didn't use LLVM [1]. I redid it just now with 7.4.1+LLVM, results below. (The table should read correctly in a fixed-width font, but you can also see the data in the spreadsheet here.)
>
>                           Time (ms)   Compiled file size   Compile+runtime (ms)
>     GHC 7.4.1 -O0         2444        1241K
>     GHC 7.4.1 -O2         925         1132K                1561
>     GHC 7.4.1 -O2 llvm    931         1133K
>     GHC 7.0.4 -O2 via-C   684         974K
>
> So LLVM didn't help [1]. And in fact the deprecated via-C backend did the best!

I would classify that as a bug.

> [1] P.P.S. Most concerning to me about Haskell/C++ comparisons are David Peixotto's findings that LLVM optimizations are not very effective on Haskell-generated LLVM compared with typical clang-generated LLVM.

There is some work underway to improve the situation, but I am sure more could be done.

Manuel