[Haskell-cafe] [ANN] Accelerate version 0.12: GPU computing with Haskell

2012-05-14 Thread Manuel M T Chakravarty
We just released version 0.12 of Data.Array.Accelerate, the GPGPU[1] library 
for Haskell:

  http://justtesting.org/gpu-accelerated-array-computations-in-haskell

This is a beta release. The library is not perfect, but it is definitely 
usable, and we are looking for early adopters.
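
For those who haven't seen Accelerate before, here is the canonical
dot-product example, to give a flavour of the programming model (a minimal
sketch along the lines of the examples in the documentation; the CUDA
backend requires an NVIDIA GPU):

import Data.Array.Accelerate      as A
import Data.Array.Accelerate.CUDA as CUDA   -- the CUDA backend

-- The whole computation is an embedded 'Acc' program; 'CUDA.run'
-- compiles it for the GPU at runtime and copies the result back.
dotp :: Vector Float -> Vector Float -> Scalar Float
dotp xs ys = CUDA.run $ A.fold (+) 0 $ A.zipWith (*) (A.use xs) (A.use ys)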

Manuel

[1] Currently only NVIDIA GPUs are supported via NVIDIA's CUDA framework.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Can Haskell outperform C++?

2012-05-14 Thread Dmitry Vyal

On 05/11/2012 07:53 AM, Ertugrul Söylemez wrote:
My point is: If you need C-like performance at a certain spot there is 
really no excuse for writing the entire application in C. Haskell has 
a working, powerful enough FFI. Also idiomatic Haskell code nowadays 
performs close to C. If your code doesn't, chances are that it's not 
even idiomatic. Sorting a list is easy and beautiful in code. But it's 
far from idiomatic. Using ST with destructive update is also not 
idiomatic. One of my performance masterpieces is the instinct AI 
library (see Hackage). It uses only immutable vectors and performs 
very heavy Double calculations, yet performs better than the same code 
with mutable arrays did. With a few years of Haskell experience in my 
backpack I know how to utilize laziness to get amazing performance for 
code that most people would feel must be written with a destructively 
updating loop. And the resulting code is just beautiful to read and 
watch at work, a great demonstration of the wonders the GHC developers 
have created.

Hello Ertugrul,

Out of curiosity, did you compare the performance of Instinct with 
implementations in languages associated with numerical computation, 
like Octave?


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] darcs patch dependencies in dot format

2012-05-14 Thread Ketil Malde
Sönke Hahn sh...@cs.tu-berlin.de writes:

 On 05/13/2012 03:13 AM, Felipe Almeida Lessa wrote:
 Truly amazing! 

Yes, nice work!

 I wonder how it would fare with larger repositories. =)

 It does not scale well. [...]
 Somewhat related questions are: What am I going to do with a dot-graph,
 that has more than 500 vertices? Is there an intelligent way to reduce
 the graph?

Lacking a solution for the problem of drawing large graphs, making the
graph smaller might be the second choice. :-) One option might be to
only track dependencies back to a specified tag? Or between tags? 

-k
-- 
If I haven't seen further, it is by standing in the footprints of giants

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Bug in SYB, SYB-documentation or GHC-API (or I messed up somewhere)

2012-05-14 Thread Philip K . F . Hölzenspies
Dear Pedro,


On 13 May 2012, at 20:29, José Pedro Magalhães wrote:
 On Fri, May 11, 2012 at 5:12 PM, Philip K. F. Hölzenspies 
 p...@st-andrews.ac.uk wrote:
 
 However, it turns out that this never leads to an evaluation of 
 extTyNamesLoc. I've come as far as to see that, since Located a is a 
 synonym for GenLocated SrcSpan a, the type we're looking for really has a 
 binary constructor, thus we should use ext2Q.
 
 This made things work: We first modify the function we apply to Located 
 things:
 
 
 extTyNamesLoc :: (Data loc, Data a) => SrcSpan -> GenLocated loc a ->
 OccurrenceTable
 extTyNamesLoc l (L l' x) = case cast l' of
   Just l'' -> extTyNames l'' x
   Nothing  -> extTyNames l x
 
 Do you really need this? Can't you use the definition of `extTyNamesLoc` 
 shown previously, and redefine `extTyNames` to use `ext2Q`, as in:

As it turns out, yes I do. You had me doubting whether I had tried this, but 
doing such a thing gives me:


refact.hs:124:72:
Could not deduce (d1 ~ SrcSpan)
from the context (Data a)
  bound by the type signature for
 extTyNames :: Data a => SrcSpan -> RefCtx -> a -> OccTab
  at refact.hs:124:1-91
or from (Data d1, Data d2)
  bound by a type expected by the context:
 (Data d1, Data d2) => GenLocated d1 d2 -> OccTab
  at refact.hs:124:21-88
  `d1' is a rigid type variable bound by
   a type expected by the context:
 (Data d1, Data d2) => GenLocated d1 d2 -> OccTab
   at refact.hs:124:21
Expected type: GenLocated d1 d2 -> OccTab
  Actual type: Located d2 -> OccTab
In the return type of a call of `extTyNamesLoc'
In the second argument of `ext2Q', namely `extTyNamesLoc l r'


Which makes a certain amount of sense. The use of ext2Q constrains d1 and d2 
to Data, but leaves them otherwise unspecified. This seems at odds with the 
intention of extXY; hence my confusion.

I have since found and come to understand extB, though. This makes the 
implementation of extTyNamesLoc much, much nicer. However, I still think 
that a good explanation of the behaviour of ext1Q/ext2Q would make a 
helpful addendum to the SYB documentation.
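
For the archives, extB makes extTyNamesLoc a one-liner. A minimal sketch
(OccurrenceTable, extTyNames, and the GHC-API types are as in my refact.hs
from earlier in this thread):

import Data.Generics (Data, extB)

-- 'extB' returns its second argument if it casts to the type of the
-- first, and the first otherwise -- the case/cast dance from before,
-- in a single combinator.
extTyNamesLoc :: (Data loc, Data a) => SrcSpan -> GenLocated loc a -> OccurrenceTable
extTyNamesLoc l (L l' x) = extTyNames (l `extB` l') x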

Regards,
Philip
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Can Haskell outperform C++?

2012-05-14 Thread Ertugrul Söylemez
Dmitry Vyal akam...@gmail.com wrote:

  My point is: If you need C-like performance at a certain spot there
  is really no excuse for writing the entire application in C.
  Haskell has a working, powerful enough FFI. Also idiomatic Haskell
  code nowadays performs close to C. If your code doesn't, chances are
  that it's not even idiomatic. Sorting a list is easy and beautiful
  in code. But it's far from idiomatic. Using ST with destructive
  update is also not idiomatic. One of my performance masterpieces is
  the instinct AI library (see Hackage). It uses only immutable
  vectors and performs very heavy Double calculations, yet performs
  better than the same code with mutable arrays did.  With a few years
  of Haskell experience in my backpack I know how to utilize laziness
  to get amazing performance for code that most people would feel must
  be written with a destructively updating loop.  And the resulting code
  is just beautiful to read and watch at work, a great demonstration
  of the wonders the GHC developers have created.

  Out of curiosity, did you compare the performance of Instinct with
  implementations in languages associated with numerical computation,
  like Octave?

No, sorry.


Greets,
Ertugrul

-- 
Key-ID: E5DD8D11 Ertugrul Soeylemez e...@ertes.de
FPrint: BD28 3E3F BE63 BADD 4157  9134 D56A 37FA E5DD 8D11
Keysrv: hkp://subkeys.pgp.net/


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] darcs patch dependencies in dot format

2012-05-14 Thread Simon Michael

On 5/12/12 5:52 AM, Sönke Hahn wrote:

Hi all!

Yesterday I wrote a little tool to output the dependencies of darcs
patches in dot format. The hardest part was to wrap my head around the
darcs API and find a way to let it compute the patch dependencies. I
don't know if I got it right, but it looks correct at first glance.

Here is the code:

https://patch-tag.com/r/shahn/darcs2dot

To use it, just execute the program in a darcs repo and it will output a
dot graph to stdout. The graph does not contain all dependencies, only
their transitive reduction. The patch names are truncated at 15 characters.
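
In case anyone wonders what the transitive reduction amounts to: for a DAG
it is unique, and a naive version is short. A sketch over a plain edge list
(not the actual darcs2dot code, which works through the darcs API):

import qualified Data.Set as Set

type Edge a = (a, a)

-- All vertices reachable from x (x itself included).
reachable :: Ord a => [Edge a] -> a -> Set.Set a
reachable es x = go (Set.singleton x) [x]
  where
    go seen []     = seen
    go seen (u:us) =
      let new = [v | (u', v) <- es, u' == u, v `Set.notMember` seen]
      in  go (foldr Set.insert seen new) (new ++ us)

-- For a DAG: an edge is redundant if its target is still reachable
-- from its source via the remaining edges.
transitiveReduction :: Ord a => [Edge a] -> [Edge a]
transitiveReduction es =
  [ e | e@(u, v) <- es
      , v `Set.notMember` reachable [e' | e' <- es, e' /= e] u ]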

And here is an example graph:

http://open-projects.net/~shahn/patchDeps.pdf

These are the patch dependencies of one of my darcs repos
(https://patch-tag.com/r/shahn/hate). Some observations I made:

* There are two completely separate subgraphs. One subgraph seems to be
for the testsuite, the other for the actual code. This means the two
don't depend on each other and could easily be put in two distinct repos.
* There is a "re-implementation" patch with a lot of incoming and
outgoing edges. (Which makes sense.)
* There are some sequences of patches (e.g. the four "allow ..." patches
in the upper left corner) that seem to contain related patches.
* darcs's patch system is awesome!

Any comments or suggestions?

Cheers,
Sönke


That's nifty, thanks for sharing it. Cc'ing darcs-user.

I tried it in a few repos, like so:

$ darcs2dot > patchdeps.dot && dot patchdeps.dot -Tpdf -o patchdeps.pdf

In a 200-patch repo it ran quickly, giving: 
http://joyful.com/darcsden/simon/darcsum/raw/patchdeps.pdf

In a 2000-patch repo it took 22 hours: 
http://joyful.com/darcsden/simon/hledger/raw/patchdeps.pdf

In the much larger Darcs repo itself I killed it after a few hours, but here 
are the early dependencies (up to tag 1.0.0rc2):
http://joyful.com/darcsden/simon/darcs-sm/raw/patchdeps.pdf

It should escape double-quotes in patch names; I did that manually.
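
Something like this should do (a sketch; backslashes are also special in
dot labels, so they get escaped too):

-- Quote a patch name for use as a dot node label.
escapeDotLabel :: String -> String
escapeDotLabel s = '"' : concatMap esc s ++ "\""
  where
    esc '"'  = "\\\""
    esc '\\' = "\\\\"
    esc c    = [c]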

I wonder how to simplify the graph further. Perhaps just the dependencies of 
tags would be interesting in some repos?

Best,
-Simon


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Can Haskell outperform C++?

2012-05-14 Thread Ryan Newton

 Well, if it's in many ways the same as C, then again it's probably not
 idiomatic Haskell.


It's just the recursive equation for Mandelbrot fractals.  I should have been
more precise: I didn't mean that the source is literally the *same* as the C
source (i.e. there's no for loop, no mutable variables), rather that it
presents the compiler backend with the same advantages as the C backend
would have with the equivalent loop.  That is, there's no control flow
obfuscation (dictionaries, calls to unknown targets), no problems with
laziness, and the data representations are completely known.

{-# LANGUAGE BangPatterns #-}
import Data.Complex

mandel :: Int -> Complex Double -> Int
mandel max_depth c = loop 0 0
  where
    loop i !z
      | i == max_depth     = i
      | magnitude z >= 2.0 = i
      | otherwise          = loop (i+1) (z*z + c)
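
(A driver in the same spirit, for anyone reproducing this -- the grid size
and depth here are illustrative guesses, not the exact parameters from my
repo:)

main :: IO ()
main = print (sum [ mandel 256 ((x/512 - 1.5) :+ (y/512 - 1.0))
                  | x <- [0 .. 1023], y <- [0 .. 1023] ])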


It's also a loop that is readily recognized as a loop.  Now, to my
knowledge, GHC doesn't explicitly recognize loops in any stage of the
compiler (so as to perform loop optimizations), something which other
functional compilers sometimes do.

But, anyway, it turns out that my example above *is easily transformed from
a bad GHC performance story into a good one*.  If you'll bear with me, I'll
show how below.
   First, Manuel makes a good point about the LLVM backend.  My 6X
anecdote was from a while ago and I didn't use LLVM [1].  I redid it just
now with 7.4.1+LLVM, results below.  (The table below should read correctly
in fixed-width font, but you can also see the data in the spreadsheet here:
https://docs.google.com/spreadsheet/ccc?key=0AvzAHqQmHo87dHU0T0lCb1I4MFJmM2s4RnNlamJlNkE .)

                     Time (ms)   Compiled file size   Compile+runtime (ms)
GHC 7.4.1 O0         2444        1241K
GHC 7.4.1 O2          925        1132K                1561
GHC 7.4.1 O2 llvm     931        1133K
GHC 7.0.4 O2 via-C    684         974K

So LLVM didn't help [1].  And in fact the deprecated via-C backend did the
best!  Compare with GCC:

G++ O0                300           9K                 531
G++ O3                110           7K                 347
G++ O3 recursive      116           9K

Uh oh, the 6X gap I mentioned is now closer to 9X.  And, in fact,
performing a mini "language shootout" on the above code reveals that GHC
is doing worse than not only OCaml, but Chez Scheme, in spite of dynamic
type checks and a necessarily boxed representation of complex numbers:

Chez Scheme 8.4       284         2.7K (not standalone)  372
OCaml                 166         180K                   301

At least Python does worse!

Python 2.6           1973          NA                   1973

*So here's the catch:*  If you check the Core and STG, GHC 7.4 is actually
compiling the above loop very well.  This microbenchmark turns into just a
"magnitude" microbenchmark.  The implementation of Data.Complex uses an
EXTREMELY expensive method to avoid overflow:
https://github.com/ghc/packages-base/blob/master/Data/Complex.hs#L115 [2].

Since OCaml did well above, I took a look at their standard library's
implementation, on line 51 here:
http://caml.inria.fr/cgi-bin/viewvc.cgi/ocaml/trunk/stdlib/complex.ml?revision=11156&view=markup
They use a nice little math trick (the extra division) that is also
mentioned on Wikipedia.  By implementing the same trick in Haskell,
replacing the standard magnitude function:
https://github.com/rrnewton/MandelMicrobench/blob/97c3275ad94cbce57a688817332b42f7c32c15b4/mandel_test2.hs
we get the following results.

GHC 7.4.1, no overflow avoidance             39    1127K    674
GHC 7.4.1, OCaml-style overflow avoidance    74    1127K
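
For concreteness, the trick looks like this in Haskell (my own rendering,
not the exact code from the repo): divide the smaller component by the
larger, so the squared ratio is at most one and cannot overflow, then
scale back up.

-- Assumes 'import Data.Complex' as in the mandel snippet above.
magnitudeFast :: Complex Double -> Double
magnitudeFast (x :+ y)
  | ax >= ay  = scale ax ay
  | otherwise = scale ay ax
  where
    ax = abs x
    ay = abs y
    scale big small
      | big == 0  = 0                      -- both components are zero
      | otherwise = big * sqrt (1 + r*r)   -- r <= 1, so r*r cannot overflow
      where r = small / big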

Wow, now not only is the Haskell version faster than OCaml/Scheme, *but it
is 48% faster than C++*, which is appropriate to the title of this email!
 Of course, this probably means that C++'s abs is also doing something
overly expensive for overflow avoidance (or that their representation of
complex numbers is not fully unpacked and stored in registers).  I haven't
tracked it down yet.

But in any case, I'll be submitting a library patch.  *The moral, I think,
is that community members could do a great deal to help Haskell
performance by simply microbenchmarking and optimizing library routines in
base!*

Cheers,
  -Ryan

P.S. You can check out the above benchmarks from here:
https://github.com/rrnewton/MandelMicrobench

[1] P.P.S. Most concerning to me about Haskell/C++ comparisons are David
Peixotto's findings that LLVM optimizations are not very effective on
Haskell-generated LLVM compared with typical clang-generated LLVM.

[2]  P.P.P.S. It turns out there was already a ticket (
http://hackage.haskell.org/trac/ghc/ticket/2450) regarding magnitude's
performance.  But it still has bad performance even after a small
refactoring was performed.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Safe Haskell at the export symbol granularity?

2012-05-14 Thread Ryan Newton
Separate from whether or not we actually want this -- is it feasible?

Here's my situation.  When working on parallel programming libraries in
Haskell there are very often unsafe operations one wants to do within an
otherwise pure model.  For example, Accelerate currently violates Safe
Haskell because it trusts the user to provide an associative function to
parallel fold.  No associativity, no referential transparency.

The solution is to put "fold" in a separate namespace and mark that module as
Unsafe.  Likewise for certain monad-par operations that are unsafe.  But
this factoring can have a serious impact.  Not only are extra modules
required, but extra type classes as well.  For example, if Accelerate is
ever refactored for Safe Haskell then the following ParAccelerate type
class probably should be as well:

https://github.com/simonmar/monad-par/blob/5cc656bc45dc473d7a185ec99bb156192f54d520/abstract-par-accelerate/Control/Monad/Par/Accelerate.hs#L75

I.e., ParAccelerate and ParAccelerateUnsafe for the unsafeHybrid operation.
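
To make the factoring concrete, the status quo looks roughly like this (a
sketch with illustrative module and class names and stand-in signatures,
not the real monad-par API):

-- File: Par/Accelerate.hs -- the safe operations.
{-# LANGUAGE Safe #-}
module Par.Accelerate (ParAccelerate (..)) where

class ParAccelerate p where
  runAcc :: p ()               -- stand-in signature

-- File: Par/Accelerate/Unsafe.hs -- a second module AND a second class,
-- just to quarantine one operation.
{-# LANGUAGE Unsafe #-}
module Par.Accelerate.Unsafe (ParAccelerateUnsafe (..)) where

import Par.Accelerate

class ParAccelerate p => ParAccelerateUnsafe p where
  unsafeHybrid :: p ()         -- stand-in signature

Any client module that so much as imports Par.Accelerate.Unsafe gives up
its Safe status; everything else keeps the guarantee.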

But this starts to be death by a thousand organizational factorings!

   - The package, abstract-par-accelerate, is already factored out from
   abstract-par just to avoid an unnecessary Accelerate dependency (which
   used to mean CUDA errors).  (And *another* factoring is possibly warranted
   for whether or not the module depends on accelerate-io.)
   - The file would be separate to mark it as Safe Haskell.
   - The type class ParAccelerateUnsafe would be separate so as to put it
   in a separate file.

What's a possible solution?  If, in addition to "Safe" and "Unsafe"
modules, there were "partially safe" modules, which exported a mix of safe
and unsafe identifiers, then this could all be much cleaner.

The simple rule is that any reference whatsoever to an unsafe identifier
disqualifies the client code.  For example, in the above ParAccelerate type
class we would mark the unsafeHybrid binding with something like {-# UNSAFE
unsafeHybrid #-}.  We wouldn't even have to factor it out of the type class.

Likewise, Accelerate could mark "fold" as unsafe (providing safe variants
as well) without introducing module namespace bloat and confusion.

What do you think?

   -Ryan
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] AI - machine learning

2012-05-14 Thread miro
Hi all, I recently started to take a look at Haskell, especially at AI. 
I can see some email addresses of interested people there, but not much 
other activity behind them. Is there a mailing list specifically for AI?


Basically I'm interested in trying some machine learning algorithms: 
starting with reinforcement learning (value-based), and going towards AGI 
(Artificial General Intelligence). Does anybody know of existing Haskell 
approaches, or is anybody working on this?


Cheers,
m.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Can Haskell outperform C++?

2012-05-14 Thread Manuel M T Chakravarty
Ryan Newton:
 But, anyway, it turns out that my example above is easily transformed from a 
 bad GHC performance story into a good one.  If you'll bear with me, I'll show 
 how below.
First, Manuel makes a good point about the LLVM backend.  My 6X anecdote 
 was from a while ago and I didn't use LLVM [1]. I redid it just now with 
 7.4.1+LLVM, results below.  (The below table should read correctly in fixed 
 width font, but you can also see the data in the spreadsheet here.)
 
                      Time (ms)   Compiled file size   Compile+runtime (ms)
 GHC 7.4.1 O0         2444        1241K
 GHC 7.4.1 O2          925        1132K                1561
 GHC 7.4.1 O2 llvm     931        1133K
 GHC 7.0.4 O2 via-C    684         974K
 
 So LLVM didn't help [1].  And in fact the deprecated via-C backend did the 
 best!  

I would classify that as a bug.

 [1] P.P.S. Most concerning to me about Haskell/C++ comparisons are David 
 Peixotto's findings that LLVM optimizations are not very effective on 
 Haskell-generated LLVM compared with typical clang-generated LLVM.

There is some work underway to improve the situation, but I am sure more could 
be done.

Manuel

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe