subject:"RE\: \[Haskell\-cafe\] I love purity, but it's killing me."

Re: [Haskell-cafe] I love purity, but it's killing me.

2009-05-27 Thread Sebastiaan Visser

On May 27, 2009, at 1:49 AM, Conal Elliott wrote:

Hi Tom,

I've been working on another code-generating graphics compiler,
generating GPU code. As always, I run into the problem of efficient
common subexpression elimination. In Pan, Vertigo Pajama, I used
lazy memoization, using stable pointers and weak references, to
avoid the worst-case-exponential behavior you mention below. I'm
now using a bottom-up CSE method that's slower and more complicated
than I'm going for.

What do you mean with `exponential behavior'? Exponential related to
what?

For my FRP EDSL to JavaScript (toy) compiler[1] I've been implementing
CSE as well. I traverses the expression tree recursively and creates
an small intermediate language containing id's (pointers) to
expressions instead of real sub-expressions.

Maybe (probably) I am very naive, but I think this trick takes time
linear to the amount of sub-expressions in my script. When using a
trie instead of a binary tree for the comparisons there should be no
more character (or atomic expression) comparisons that the amount of
characters in the script.

So the problem seems not to be CSE algorithm, but the fact that EDSL
itself tends to blow up because it is hosted in Haskell. Like Tom's
example:

let d = Add c c
e = Add d d-- e now as 16 leaf nodes.

But again, I might be missing some important point here.

What's your latest wisdom about CSE in DSELs?

Thanks, - Conal

On Thu, Feb 7, 2008 at 11:33 PM, Tom Hawkins tomahawk...@gmail.com
wrote:

...

--
Sebastiaan Visser

(warning: messy code)
[1]
http://github.com/sebastiaanvisser/frp-js/blob/b4f37d3b564c4932a3019b9b580e6da9449768a8/src/Core/Compiler.hs
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

RE: [Haskell-cafe] I love purity, but it's killing me.

2009-05-27 Thread Sittampalam, Ganesh

Sebastiaan Visser wrote:
 On May 27, 2009, at 1:49 AM, Conal Elliott wrote:
 Hi Tom,
 
 I've been working on another code-generating graphics compiler,
 generating GPU code.  As always, I run into the problem of efficient
 common subexpression elimination.  In Pan, Vertigo  Pajama, I used
 lazy memoization, using stable pointers and weak references, to avoid
 the worst-case-exponential behavior you mention below.  I'm now using
 a bottom-up CSE method that's slower and more complicated than I'm
 going for.
 
 What do you mean with `exponential behavior'? Exponential related to
 what? 
 
 For my FRP EDSL to JavaScript (toy) compiler[1] I've been
 implementing CSE as well. I traverses the expression tree recursively
 and creates an small intermediate language containing id's (pointers)
 to expressions instead of real sub-expressions.   
 
 Maybe (probably) I am very naive, but I think this trick takes time
 linear to the amount of sub-expressions in my script. When using a
 trie instead of a binary tree for the comparisons there should be no
 more character (or atomic expression) comparisons that the amount of
 characters in the script.
 
 So the problem seems not to be CSE algorithm, but the fact that EDSL
 itself tends to blow up because it is hosted in Haskell. Like Tom's 
 example:
 
   let d = Add c c
   e = Add d d-- e now as 16 leaf nodes.
 
 But again, I might be missing some important point here.

That's exactly right. But it's pretty inconvenient to have your
expression tree to blow up exponentially in relation to the code the
user actually wrote! You can indeed construct an intermediate language
that collapses this blowup, but the pass to create it must take
exponential time if written completely purely, since it has to visit
everything at least once.

In my experience [1], observable sharing using GHC's stable names is a
pretty effective solution to this problem.

Ganesh

[1] http://www.earth.li/~ganesh/research/paradise-icfp08/

=== 
 Please access the attached hyperlink for an important electronic 
communications disclaimer: 
 http://www.credit-suisse.com/legal/en/disclaimer_email_ib.html 
 
=== 
 
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] I love purity, but it's killing me.

2009-05-27 Thread Sebastiaan Visser


On May 27, 2009, at 12:51 PM, Sittampalam, Ganesh wrote:

Sebastiaan Visser wrote:

...

But again, I might be missing some important point here.


That's exactly right. But it's pretty inconvenient to have your
expression tree to blow up exponentially in relation to the code the
user actually wrote! You can indeed construct an intermediate language
that collapses this blowup, but the pass to create it must take
exponential time if written completely purely, since it has to visit
everything at least once.

In my experience [1], observable sharing using GHC's stable names is a
pretty effective solution to this problem.

Ganesh

[1] http://www.earth.li/~ganesh/research/paradise-icfp08/



Thanks, I just pushed your paper on top of my stack.

--
Sebastiaan Visser



___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] I love purity, but it's killing me.

2009-05-27 Thread Conal Elliott

Hi Wren,

I considered the idea of hashing, but not *perfect* hashing.  I don't know
how to hash perfectly with something like expressions, which have infinitely
many values.

Since it's stateful, that means the smart constructors may need to be in an
 appropriate monad/applicative for passing the memo table around (some hash
 functions may not need to store the table explicitly).


Hm -- stateful?  Unless I'm misunderstanding, a stateful 
monadic/applicative approach would break the simple functional interface I'm
going for.  Could well be I haven't formed a mental picture that matches
yours.

  - Conal

On Tue, May 26, 2009 at 5:23 PM, wren ng thornton w...@freegeek.org wrote:

 Conal Elliott wrote:

 Hi Tom,

 I've been working on another code-generating graphics compiler, generating
 GPU code.  As always, I run into the problem of efficient common
 subexpression elimination.  In Pan, Vertigo  Pajama, I used lazy
 memoization, using stable pointers and weak references, to avoid the
 worst-case-exponential behavior you mention below.  I'm now using a
 bottom-up CSE method that's slower and more complicated than I'm going
 for.

 What's your latest wisdom about CSE in DSELs?

 Thanks,  - Conal



 One common trick that Tom didn't seem to mention in the 2008-02-07T23:33
 post is hash cons'ing.

 Given a perfect hash function, traverse the term bottom-up storing each
 (hash,subterm) pair in a memo table and replacing the subterm by its hash.
 Once that's done, equality checks are trivial, and the memotable can be
 converted to SSA rather easily.

 This works best if you amortize the memoization by doing it with smart
 constructors, so that you don't need to worry about the exponential
 duplication of work for expressions with DAGy structure sharing in the
 Haskell. Since it's stateful, that means the smart constructors may need to
 be in an appropriate monad/applicative for passing the memo table around
 (some hash functions may not need to store the table explicitly).

 Maybe this is the too-slow too-complex solution you're using already?

 --
 Live well,
 ~wren

 ___
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] I love purity, but it's killing me.

2009-05-27 Thread Conal Elliott

What do you mean with `exponential behavior'? Exponential related to what?

I mean that the size of the observable tree can be exponential in the size
of the unobservable dag representation.

So the problem seems not to be CSE algorithm, but the fact that EDSL itself
tends to blow up because it is hosted in Haskell.

In other words, the tree size blows up, and hosting in pure Haskell doesn't
allow us to examine the compact dag.

Are we on the same track now?

- Conal

On Wed, May 27, 2009 at 3:15 AM, Sebastiaan Visser sfvis...@cs.uu.nlwrote:

On May 27, 2009, at 1:49 AM, Conal Elliott wrote:

Hi Tom,

I've been working on another code-generating graphics compiler, generating
GPU code. As always, I run into the problem of efficient common
subexpression elimination. In Pan, Vertigo Pajama, I used lazy
memoization, using stable pointers and weak references, to avoid the
worst-case-exponential behavior you mention below. I'm now using a
bottom-up CSE method that's slower and more complicated than I'm going for.

What do you mean with `exponential behavior'? Exponential related to what?

For my FRP EDSL to JavaScript (toy) compiler[1] I've been implementing CSE
as well. I traverses the expression tree recursively and creates an small
intermediate language containing id's (pointers) to expressions instead of
real sub-expressions.

Maybe (probably) I am very naive, but I think this trick takes time linear
to the amount of sub-expressions in my script. When using a trie instead of
a binary tree for the comparisons there should be no more character (or
atomic expression) comparisons that the amount of characters in the script.

So the problem seems not to be CSE algorithm, but the fact that EDSL itself
tends to blow up because it is hosted in Haskell. Like Tom's example:

let d = Add c c
e = Add d d-- e now as 16 leaf nodes.

But again, I might be missing some important point here.

What's your latest wisdom about CSE in DSELs?

Thanks, - Conal

On Thu, Feb 7, 2008 at 11:33 PM, Tom Hawkins tomahawk...@gmail.com
wrote:

...

--
Sebastiaan Visser

(warning: messy code)
[1]
http://github.com/sebastiaanvisser/frp-js/blob/b4f37d3b564c4932a3019b9b580e6da9449768a8/src/Core/Compiler.hs

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] I love purity, but it's killing me.

2009-05-27 Thread Conal Elliott

In my experience [1], observable sharing using GHC's stable names is a
pretty effective solution to this problem.

Plus unsafePerformIO and weak references as in *Stretching the storage
manager: weak pointers and stable names in
Haskellhttp://citeseer.ist.psu.edu/peytonjones99stretching.html
*?

Lacking a more elegant alternative, that's what I'll probably do again, as
in Pan, Vertigo, and Pajama.

- Conal

On Wed, May 27, 2009 at 3:51 AM, Sittampalam, Ganesh
ganesh.sittampa...@credit-suisse.com wrote:

Sebastiaan Visser wrote:
On May 27, 2009, at 1:49 AM, Conal Elliott wrote:
Hi Tom,

I've been working on another code-generating graphics compiler,
generating GPU code. As always, I run into the problem of efficient
common subexpression elimination. In Pan, Vertigo Pajama, I used
lazy memoization, using stable pointers and weak references, to avoid
the worst-case-exponential behavior you mention below. I'm now using
a bottom-up CSE method that's slower and more complicated than I'm
going for.

What do you mean with `exponential behavior'? Exponential related to
what?

For my FRP EDSL to JavaScript (toy) compiler[1] I've been
implementing CSE as well. I traverses the expression tree recursively
and creates an small intermediate language containing id's (pointers)
to expressions instead of real sub-expressions.

So the problem seems not to be CSE algorithm, but the fact that EDSL
itself tends to blow up because it is hosted in Haskell. Like Tom's
example:

let d = Add c c
e = Add d d-- e now as 16 leaf nodes.

But again, I might be missing some important point here.

That's exactly right. But it's pretty inconvenient to have your
expression tree to blow up exponentially in relation to the code the
user actually wrote! You can indeed construct an intermediate language
that collapses this blowup, but the pass to create it must take
exponential time if written completely purely, since it has to visit
everything at least once.

In my experience [1], observable sharing using GHC's stable names is a
pretty effective solution to this problem.

Ganesh

[1]
http://www.earth.li/~ganesh/research/paradise-icfp08/http://www.earth.li/%7Eganesh/research/paradise-icfp08/

===
Please access the attached hyperlink for an important electronic
communications disclaimer:
http://www.credit-suisse.com/legal/en/disclaimer_email_ib.html

===

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] I love purity, but it's killing me.

2009-05-27 Thread Conal Elliott

I just remembered: Andy Gill has a new paper Type Directed Observable
Sharing (http://www.ittc.ku.edu/~andygill/paper.php?label=DSLExtract09)
that looks very relevant.

Abstract:

Haskell is a great language for writing and supporting embedded Domain
 Specific Languages (DSLs). Some form of observable sharing is often a
 critical capability for allowing so-called deep DSLs to be compiled and
 processed. In this paper, we describe and explore uses of an IO function for
 reification which allows direct observation of sharing.




On Tue, May 26, 2009 at 4:49 PM, Conal Elliott co...@conal.net wrote:

 Hi Tom,

 I've been working on another code-generating graphics compiler, generating
 GPU code.  As always, I run into the problem of efficient common
 subexpression elimination.  In Pan, Vertigo  Pajama, I used lazy
 memoization, using stable pointers and weak references, to avoid the
 worst-case-exponential behavior you mention below.  I'm now using a
 bottom-up CSE method that's slower and more complicated than I'm going for.

 What's your latest wisdom about CSE in DSELs?

 Thanks,  - Conal


 On Thu, Feb 7, 2008 at 11:33 PM, Tom Hawkins tomahawk...@gmail.comwrote:

 I've been programming with Haskell for a few years and love it.  One
 of my favorite applications of Haskell is using for domain specific
 languages.  However, after designing a handful of DSLs, I continue to
 hit what appears to be a fundamental hurdle -- or at least I have yet
 to find an adequate solution.

 My DSLs invariably define a datatype to capture expressions; something
 like this:

 data Expression
  = Add Expression Expression
  | Sub Expression Expression
  | Variable String
  | Constant Int
  deriving Eq

 Using the datatype Expression, it is easy to mass a collections of
 functions to help assemble complex expressions, which leads to very
 concise programs in the DSL.

 The problem comes when I want to generate efficient code from an
 Expression (ie. to C or some other target language).  The method I use
 invovles converting the tree of subexpressions into an acyclic graphic
 to eliminate common subexpressions.  The nodes are then topologically
 ordered and assigned an instruction, or statement for each node.  For
 example:

 let a = Add (Constant 10) (Variable i1)
b = Sub (Variable i2) (Constant 2)
c = Add a b

 would compile to a C program that may look like this:

  a = 10 + i1;
  b = i2 - 2;
  c = a + b;

 The process of converting an expression tree to a graph uses either Eq
 or Ord (either derived or a custom instance) to search and build a set
 of unique nodes to be ordered for execution.  In this case a, then
 b, then c.  The problem is expressions often have shared,
 equivalent subnodes, which dramatically grows the size of the tree.
 For example:

 let d = Add c c
e = Add d d-- e now as 16 leaf nodes.

 As these trees grow in size, the equality comparison in graph
 construction quickly becomes the bottleneck for DSL compilation.
 What's worse, the phase transition from tractable to intractable is
 very sharp.  In one of my DSL programs, I made a seemingly small
 change, and compilation time went from milliseconds to
 not-in-a-million-years.

 Prior to Haskell, I wrote a few DSLs in OCaml.  I didn't have this
 problem in OCaml because each let expression was mutable, and I
 could use the physical equality operator to perform fast comparisons.
 Unfortunately, I have grown to love Haskell's type system and its lack
 of side effects, and could never go back.

 Is there anything that can be done to dramatically speed up
 comparisons, or is there a better approach I can take to extract
 common subexpressions?  I should point out I have an opportunity to
 get Haskell on a real industrial application.  But if I can't solve
 this problem, I may have to resort to far less eloquent languages.
 :-(

 Thanks for any and all help.

 -Tom
 ___
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe



___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

RE: [Haskell-cafe] I love purity, but it's killing me.

2009-05-27 Thread Sittampalam, Ganesh

Yes, though we don't bother with weak pointers as we only keep the
stable names map around for the duration of CSE so there's no ongoing
memory leak issue.

From: haskell-cafe-boun...@haskell.org
[mailto:haskell-cafe-boun...@haskell.org] On Behalf Of Conal Elliott
Sent: 27 May 2009 16:14
To: Sittampalam, Ganesh
Cc: Haskell Cafe
Subject: Re: [Haskell-cafe] I love purity, but it's killing me.

In my experience [1], observable sharing using GHC's stable
names is a pretty effective solution to this problem.

Plus unsafePerformIO and weak references as in Stretching the storage
manager: weak pointers and stable names in Haskell
http://citeseer.ist.psu.edu/peytonjones99stretching.html ?

Lacking a more elegant alternative, that's what I'll probably do again,
as in Pan, Vertigo, and Pajama.

  - Conal

On Wed, May 27, 2009 at 3:51 AM, Sittampalam, Ganesh
ganesh.sittampa...@credit-suisse.com wrote:

Sebastiaan Visser wrote:
 On May 27, 2009, at 1:49 AM, Conal Elliott wrote:
 Hi Tom,

 I've been working on another code-generating graphics
compiler,
 generating GPU code.  As always, I run into the problem of
efficient
 common subexpression elimination.  In Pan, Vertigo  Pajama,
I used
 lazy memoization, using stable pointers and weak references,
to avoid
 the worst-case-exponential behavior you mention below.  I'm
now using
 a bottom-up CSE method that's slower and more complicated
than I'm
 going for.

 What do you mean with `exponential behavior'? Exponential
related to
 what?

 For my FRP EDSL to JavaScript (toy) compiler[1] I've been
 implementing CSE as well. I traverses the expression tree
recursively
 and creates an small intermediate language containing id's
(pointers)
 to expressions instead of real sub-expressions.

 Maybe (probably) I am very naive, but I think this trick takes
time
 linear to the amount of sub-expressions in my script. When
using a
 trie instead of a binary tree for the comparisons there should
be no
 more character (or atomic expression) comparisons that the
amount of
 characters in the script.

 So the problem seems not to be CSE algorithm, but the fact
that EDSL
 itself tends to blow up because it is hosted in Haskell. Like
Tom's
 example:

   let d = Add c c
   e = Add d d-- e now as 16 leaf nodes.

 But again, I might be missing some important point here.

That's exactly right. But it's pretty inconvenient to have your
expression tree to blow up exponentially in relation to the code
the
user actually wrote! You can indeed construct an intermediate
language
that collapses this blowup, but the pass to create it must take
exponential time if written completely purely, since it has to
visit
everything at least once.

In my experience [1], observable sharing using GHC's stable
names is a
pretty effective solution to this problem.

Ganesh

[1] http://www.earth.li/~ganesh/research/paradise-icfp08/
http://www.earth.li/%7Eganesh/research/paradise-icfp08/ 

===
 Please access the attached hyperlink for an important
electronic communications disclaimer:
 http://www.credit-suisse.com/legal/en/disclaimer_email_ib.html

===

=== 
 Please access the attached hyperlink for an important electronic 
communications disclaimer: 
 http://www.credit-suisse.com/legal/en/disclaimer_email_ib.html 

=== 

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] I love purity, but it's killing me.

2009-05-27 Thread wren ng thornton


Conal Elliott wrote:

Hi Wren,

I considered the idea of hashing, but not *perfect* hashing.  I don't know
how to hash perfectly with something like expressions, which have infinitely
many values.


An imperfect hash can work. You'll need a memo table with a source of 
unique symbols (e.g. storing the next unused integer) in order to, 
effectively, make the hash function perfect[1]. If you have a source 
of unique symbols then you can also use a trie, Data.Map, or similar in 
lieu of a hash map.


In a language with pointers (or stable names), the pointer is often used 
as the hash in conjunction with using the memo table as an intern 
table for smart constructors. Thus, client code can never observe that 
structurally equal expressions could have different hashes.


[1]
hash :: HashState m = Expr - m Hash
hash e = case lookupHS e of
 Just h  - return h
 Nothing - do h - nextH
   insertHS e h
   return h


 Since it's stateful, that means the smart constructors may need to be in an
 appropriate monad/applicative for passing the memo table around (some hash
 functions may not need to store the table explicitly).

Hm -- stateful?  Unless I'm misunderstanding, a stateful 
monadic/applicative approach would break the simple functional interface I'm
going for.  Could well be I haven't formed a mental picture that matches
yours.


Er, it's only stateful for the versions above that use pointers or a 
source of unique symbols (since they need to maintain a memo table). If 
you can come up with a perfect hash function[2], then there's no need to 
create/store the memo table at all, since it can be reconstructed on the 
fly. Since perfect hashing often isn't feasible, the stateful 
approximations to a perfect hash function are generally used. Sorry if I 
was unclear.


If you don't mind unsafePerformIO (or similar hacks) then you can hide 
the state from the type system by using the same table for the whole 
program. Generally for hash cons'ing you want your tables to be as large 
as they can be (to maximize sharing) so this shouldn't be problematic. 
However, for languages with scoping it can be beneficial to use separate 
tables to recognize when expressions need to be recomputed; so the 
global store might want to be something like a stack of memo tables with 
fall-through lookup.


I believe Applicative is powerful enough to capture the sort of state 
passing needed since the client code can't ever make decisions based on 
the state. So with smart constructors (to package up the * etc) I'd 
think you should be able to have an EDSL that looks nice, just with a 
more complicated type. Perhaps the issues are with mixing pure Haskell 
functions into the EDSL?


...

The real trick behind hash cons'ing is everywhere substituting the 
hash in for the sub-expression, effectively flattening all expressions 
into a single ply. Thus, expression constructors cons the hashes 
rather than cons'ing expressions. It's similar in spirit to trie'ing, 
but from the bottom up in the same way that dynamic programming is done.


The reason for wanting to do the hashing in smart constructors, as 
opposed to at the end, is to maximize the benefit of dynamic 
programming. If all client-visible expressions are represented by 
hashes, then any structure sharing in the Haskell layer is sharing the 
hash representation, thus you don't need to traverse the shared 
substructure multiple times. (If you hand construct equal expressions 
without sharing, then you'll have to traverse each expression to prove 
that they're equal, but you can use that proof (the hashes) 
thenceforth). For host languages with destructive updates (like 
Smalltalk's become), you can rewrite the subterms as you traverse 
them, so doing it at the end isn't too bad.


If you only expose smart constructors then your Expr type can recurse 
as whatever Hash type. If you do the hashing at the end, then you'll 
need to define a catamorphism on Expr.


...

This is probably similar to what you're doing in Pan, Vertigo, and 
Pajama (I haven't read it). The general technique is elegant in its 
simplicity, and it's not uncommon. Though, like most dynamic programming 
tricks, it seems not to be as widely known as I would assume, so I 
thought I'd mention it in case you've missed it.




[2] Into any domain of terms that can quickly answer (==), namely flat 
terms like Integer. Using a bounded type like Int can give better 
performance guarantees, but there's only so many of them.


--
Live well,
~wren
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] I love purity, but it's killing me.

2009-05-26 Thread Conal Elliott

Hi Tom,

I've been working on another code-generating graphics compiler, generating
GPU code.  As always, I run into the problem of efficient common
subexpression elimination.  In Pan, Vertigo  Pajama, I used lazy
memoization, using stable pointers and weak references, to avoid the
worst-case-exponential behavior you mention below.  I'm now using a
bottom-up CSE method that's slower and more complicated than I'm going for.

What's your latest wisdom about CSE in DSELs?

Thanks,  - Conal

On Thu, Feb 7, 2008 at 11:33 PM, Tom Hawkins tomahawk...@gmail.com wrote:

 I've been programming with Haskell for a few years and love it.  One
 of my favorite applications of Haskell is using for domain specific
 languages.  However, after designing a handful of DSLs, I continue to
 hit what appears to be a fundamental hurdle -- or at least I have yet
 to find an adequate solution.

 My DSLs invariably define a datatype to capture expressions; something
 like this:

 data Expression
  = Add Expression Expression
  | Sub Expression Expression
  | Variable String
  | Constant Int
  deriving Eq

 Using the datatype Expression, it is easy to mass a collections of
 functions to help assemble complex expressions, which leads to very
 concise programs in the DSL.

 The problem comes when I want to generate efficient code from an
 Expression (ie. to C or some other target language).  The method I use
 invovles converting the tree of subexpressions into an acyclic graphic
 to eliminate common subexpressions.  The nodes are then topologically
 ordered and assigned an instruction, or statement for each node.  For
 example:

 let a = Add (Constant 10) (Variable i1)
b = Sub (Variable i2) (Constant 2)
c = Add a b

 would compile to a C program that may look like this:

  a = 10 + i1;
  b = i2 - 2;
  c = a + b;

 The process of converting an expression tree to a graph uses either Eq
 or Ord (either derived or a custom instance) to search and build a set
 of unique nodes to be ordered for execution.  In this case a, then
 b, then c.  The problem is expressions often have shared,
 equivalent subnodes, which dramatically grows the size of the tree.
 For example:

 let d = Add c c
e = Add d d-- e now as 16 leaf nodes.

 As these trees grow in size, the equality comparison in graph
 construction quickly becomes the bottleneck for DSL compilation.
 What's worse, the phase transition from tractable to intractable is
 very sharp.  In one of my DSL programs, I made a seemingly small
 change, and compilation time went from milliseconds to
 not-in-a-million-years.

 Prior to Haskell, I wrote a few DSLs in OCaml.  I didn't have this
 problem in OCaml because each let expression was mutable, and I
 could use the physical equality operator to perform fast comparisons.
 Unfortunately, I have grown to love Haskell's type system and its lack
 of side effects, and could never go back.

 Is there anything that can be done to dramatically speed up
 comparisons, or is there a better approach I can take to extract
 common subexpressions?  I should point out I have an opportunity to
 get Haskell on a real industrial application.  But if I can't solve
 this problem, I may have to resort to far less eloquent languages.
 :-(

 Thanks for any and all help.

 -Tom
 ___
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] I love purity, but it's killing me.

2009-05-26 Thread wren ng thornton


Conal Elliott wrote:

Hi Tom,

I've been working on another code-generating graphics compiler, generating
GPU code.  As always, I run into the problem of efficient common
subexpression elimination.  In Pan, Vertigo  Pajama, I used lazy
memoization, using stable pointers and weak references, to avoid the
worst-case-exponential behavior you mention below.  I'm now using a
bottom-up CSE method that's slower and more complicated than I'm going for.

What's your latest wisdom about CSE in DSELs?

Thanks,  - Conal



One common trick that Tom didn't seem to mention in the 2008-02-07T23:33 
post is hash cons'ing.


Given a perfect hash function, traverse the term bottom-up storing each 
(hash,subterm) pair in a memo table and replacing the subterm by its 
hash. Once that's done, equality checks are trivial, and the memotable 
can be converted to SSA rather easily.


This works best if you amortize the memoization by doing it with smart 
constructors, so that you don't need to worry about the exponential 
duplication of work for expressions with DAGy structure sharing in the 
Haskell. Since it's stateful, that means the smart constructors may need 
to be in an appropriate monad/applicative for passing the memo table 
around (some hash functions may not need to store the table explicitly).


Maybe this is the too-slow too-complex solution you're using already?

--
Live well,
~wren
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] I love purity, but it's killing me.

2009-05-26 Thread Tom Hawkins

On Tue, May 26, 2009 at 6:49 PM, Conal Elliott co...@conal.net wrote:
 Hi Tom,

 I've been working on another code-generating graphics compiler, generating
 GPU code.  As always, I run into the problem of efficient common
 subexpression elimination.  In Pan, Vertigo  Pajama, I used lazy
 memoization, using stable pointers and weak references, to avoid the
 worst-case-exponential behavior you mention below.  I'm now using a
 bottom-up CSE method that's slower and more complicated than I'm going for.

 What's your latest wisdom about CSE in DSELs?

I wasn't able to find a solution that offered both performance and
elegance, so I changed the fundamental operation of the DSL (in this
case, atom).  When atom was still a hardware description language, the
compiler would combine several user defined expressions together
resulting in very wide and deep expression trees, resulting in the
same problem you are observing.  But when I switch the target of atom
from HDL to C, the compiler no longer needed to perform the same
expression expansion.  And since the user defined expressions are
generally shallow -- at least in the case of my applications -- atom
is able to get away with exhaustive equality comparison (deriving Eq).

Sorry I can't be of more help.

-Tom
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] I love purity, but it's killing me.

2008-02-10 Thread Matthew Naylor

Hi Tom,

 So is the general strategy with observable sharing to use
 unsafePerformIO with Data.Unique to label expressions at
 construction?

something like that, yes.  Basically, you just need:

  {-# NOINLINE ref #-}
  ref x = unsafePerformIO (newIORef x)

and you can write expressions like

  ref False == ref False

and

  let x = ref False in x == x

However, while referential equality is enough for sharing detection, I
*suspect* it's simpler to use the fact that refs are IORefs and you
can read and write them (in the IO monad).  So a very simple Lava
might look like

  module Lava (Bit,Netlist,low,high,nand2,netlist) where

  import Data.IORef
  import System.IO.Unsafe

  {-# NOINLINE ref #-}
  ref x = unsafePerformIO (newIORef x)

  type Ref = IORef (Maybe Int)

  data Bit = Gate String Ref [Bit]

  type Netlist = [(String,   Int, [Int])]
  --   gate, output,  inputs

  low = Gate low (ref Nothing) []
  high = Gate high (ref Nothing) []
  nand2 (a, b) = Gate nand2 (ref Nothing) [a, b]

  netlist :: Bit - IO Netlist
  netlist x = do i - newIORef (0 :: Int) ; f i x
where
  f i (Gate str r xs) =
do val - readIORef r
   num - readIORef i
   case val of
 Nothing - do writeIORef r (Just num)
   writeIORef i (num+1)
   rest - mapM (f i) xs
   let is = map ((\(g,o,is) - o) . head) rest
   return ((str,num,is):concat rest)
 Just j - return [(indirection,j,[])]  -- explicit sharing!

Indirections can be filtered out at the end, they don't actually give
the netlist any information.

 Of course, now that you have me reading up on Yhc.Core, option #5 is
 looking considerably more fun.

Yeah, I think Yhc.Core is pretty nifty too.  Thank Neil!

Matt.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] I love purity, but it's killing me.

2008-02-09 Thread Bertram Felgenhauer

Matthew Naylor wrote:
[snip]
 Finally, when I say observable sharing, I don't necessarily mean it
 as defined by Koen Claessen and David Sands.  I simply mean the use of
 unsafePerformIO to detect sharing, whether or not this is done by an
 eq predicate on Refs. (I say this because I think there are simpler
 ways to detect sharing, though these will probably not have the nice
 semantic properties of observable sharing.)

ghc actually provides a primop for this:

  reallyUnsafePtrEquality# :: a - a - Int#

Use at your own risk.

Note that you can only check for equality uing that primop. To detect
cycles in data structures efficiently, a total order would be better,
but in the presence of copying garbage collection that's asking too
much.

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] I love purity, but it's killing me.

2008-02-09 Thread Felipe Lessa

On Feb 9, 2008 12:34 PM, Bertram Felgenhauer
[EMAIL PROTECTED] wrote:
 ghc actually provides a primop for this:

   reallyUnsafePtrEquality# :: a - a - Int#

 Use at your own risk.

Why is it more than unsafe? 'unsafePerformIO' seems to me a lot
unsafer than 'reallyUnsafePtrEquality#'.

Also, is anybody using 'reallyUnsafePtrEquality#' on a working project?

Cheers,

-- 
Felipe.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] I love purity, but it's killing me.

2008-02-09 Thread Bertram Felgenhauer

Matthew Naylor wrote:
(snip)
 Now, there remains the concern that Haskell's semantics does not
 enforce sharing.  A Haskell compiler is free to change the sharing a
 program at a whim, unknowingly to the programmer who may be relying on
 it in for an efficient program.  However, to my knowledge, it is an
 unwritten rule of Haskell compilers that sharing *is* preserved, and
 that they do perform *graph* reduction.

That is not true anymore for the threaded runtime of ghc. If two
threads demand the same thunk, one of them will usually block, but
there is a small window where both threads can start evaluting
the expression. To prevent this, you'd have to take a lock or
otherwise synchronize the threads upon entering each thunk, which
is prohibitively expensive.

See Haskell on a Shared-Memory Multiprocessor,
http://www.haskell.org/~simonmar/papers/multiproc.pdf

for details, section 3.1 in particular.

Bertram
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] I love purity, but it's killing me.

2008-02-09 Thread Matthew Naylor

Hi Tom,

 In addition to the sharing problem, another shortcoming of Haskell
 DSLs is they can not fully exploit the benefits of algebraic
 datatypes.  Specifically, pattern matching ADTs can only be used to
 control the compile-time configuration of the target, it can't be used
 to describe the target's behavior -- at least for DSLs that generate
 code that executes outside of Haskell's runtime.

you can embed algebraic data types and pattern matching in Haskell
with your own semantics, and retain type inference.  It goes something
like this:

  (nil, (|)) = datatype (cons0 [] \/ cons2 (:))

  map f xs = match xs rules
where
  rules (x, xs) =
[ nil  --  nil
, x | xs  --  f x | map f xs
]

here, map :: (Term a - Term b) - Term [a] - Term [b].

The main issue is that you have to quantify the free variables in
patterns, hence the rules function.  I don't know if this helps you.

 Writing a real compiler would solve both of these problems.  Is there
 any Haskell implementation that has a clean cut-point, from which I
 can start from a fully type-checked, type-annotated intermediate
 representation?

The Yhc.Core library is very simple to use and fairly mature (Neil's
been tweeking it for about 3 years now), but it doesn't meet your
type-annotated requirement. (Untyped core is still pretty useful,
though.)

If you go the real compiler route, would it not make sense to take the
DSL as the source language rather than Haskell?  Or are the DSL and
Haskell quite similar?  Or perhaps you are thinking of a two language
system, where some code is evaluated at compile time by Haskell, and
some is compiled to the target language?  If so, maybe you just want
domain specific syntax inside a Haskell program, in which case the
paper Why it's nice to be quoted: quasiquoting for haskell might be
relevant (it's actually supported in GHC I think).

Anyway, all very thought provoking!

Matt.

P.S.

Tom Hawkins wrote:
 Emil Axelsson wrote:
  I know of a few of ways to express sharing in a pure language:
 
  1) Observable sharing, which, in general, is unsafe.
  2) Using Template Haskell
  3) Matthew Naylor has done some work on expressible sharing, which has
  4) Use a monad (but I'm sure this is what you're trying to avoid).
 
 Or...
 
 5) Forget embedding the DSL, and write a direct compiler.

Taking options 2 or 5 just to solve the sharing problem sounds to me
like a lot of hard work for little reward.  But don't worry, I won't
repeat my observable sharing speech. :-)
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] I love purity, but it's killing me.

2008-02-09 Thread Henning Thielemann


On Fri, 8 Feb 2008, Tom Hawkins wrote:

 5) Forget embedding the DSL, and write a direct compiler.

 In addition to the sharing problem, another shortcoming of Haskell
 DSLs is they can not fully exploit the benefits of algebraic
 datatypes.  Specifically, pattern matching ADTs can only be used to
 control the compile-time configuration of the target, it can't be used
 to describe the target's behavior -- at least for DSLs that generate
 code that executes outside of Haskell's runtime.

Also in a pure Haskell library you will try to avoid direct access to
constructors, because the internal data structures might change. Better
are functions that access the internal data of a type, like 'maybe' and
'either' for 'Maybe' and 'Either', respectively.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] I love purity, but it's killing me.

2008-02-09 Thread Alfonso Acosta

On Feb 9, 2008 12:28 AM, Tom Hawkins [EMAIL PROTECTED] wrote:
5) Forget embedding the DSL, and write a direct compiler.

In addition to the sharing problem, another shortcoming of Haskell
DSLs is they can not fully exploit the benefits of algebraic
datatypes. Specifically, pattern matching ADTs can only be used to
control the compile-time configuration of the target, it can't be used
to describe the target's behavior -- at least for DSLs that generate
code that executes outside of Haskell's runtime.

Only partly true. Probably you are not aware of them (I myself learned
about its existence a few days ago) but pattern quasiquoting
(available in GHC's HEAD) can be used for that.

http://www.haskell.org/ghc/dist/current/docs/users_guide/template-haskell.html#th-quasiquotation

Writing a real compiler would solve both of these problems. Is there
any Haskell implementation that has a clean cut-point, from which I
can start from a fully type-checked, type-annotated intermediate
representation?

If you have to write a compiler why not define a language which fits
better with the semantics of the embedded language instead of using
plain Haskell?

The approach you propose has the disadvantages of both the embedded
and the standalone languages. On one hand you have to stick with the
syntax of the host language which may not fit with your exact
semantical requirements and, on the other hand, you cannot take
advantage of all the existing machinery around the host language (you
have to code your own compiler).

Furthermore, the first citizen status of functions make it impossible
(or really difficult at least) to compile EDSL descriptions avoiding
runtime and simply applying a static analysis approach (using Core or
plain Haskell as input).

And thanks for the link to John's paper describing Hydra's use of
Template Haskell. I will definiately consider TH.

Well, TH would be one of those static analysis approaches. Actually,
O'Donell's implementation (which uses an outdated version of Template
Haskell) only works with a small Haskell subset. So using TH you'd
probably be changing the host language anyhow.

Furthermore, the TH approach consists in adding node labels by
preprocessing the EDSL description, making sharing observable. That
makes the original EDSL description inpure. The only difference is
that side effects are added by preprocessing instead of using runtime
unsafe functions.

Some pointers covering the topic:

[1] and [2] summarize what are the alternatives to observe sharing in
Haskell whereas [3] compares the embedded approach vs standalone
approach and advocates the last one.

1) http://www.imit.kth.se/~ingo/MasterThesis/ThesisAlfonsoAcosta2007.pdf
(section 2.4.1 and 3.1)
2) http://www.cs.um.edu.mt/svrg/Papers/csaw2006-01.pdf (section 3)
3) http://web.cecs.pdx.edu/~sheard/papers/secondLook.ps
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] I love purity, but it's killing me.

2008-02-09 Thread Henning Thielemann


On Fri, 8 Feb 2008, Matthew Naylor wrote:

 Now recall that referential transparency lets you replace equals with
 equals without changing the *value produced* by a program.  Note that
 it says nothing about preserving *runtime behaviour*.  Sharing, for
 example, may be lost.  So if you do equational reasoning on function
 f (above), and loose some sharing, then you can only expect that the
 same sharing will also be also lost in the generated program.  As long
 as the generated program computes the same result as it did before,
 referential transparency will be, overall, preserved; it would only be
 lost intermediately.  This is what I mean by safe.

 I think there are degrees of observability. If a Haskell library
immediately talks to a C library and shares resources generated by the
library, then this sharing can be hardly observed and the method is
somehow safe. If you generate a C program with Haskell and write it to a
disk it can be easily observed and people might rely on a particular
resulting C program. If the C program is piped to a C compiler which is
immediately run, then sharing can be hardly observed. Even within Haskell
sharing is somehow observable, the Haskell program could observe the free
memory of the machine and thus it can see a difference between sharing and
duplicated objects.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] I love purity, but it's killing me.

2008-02-09 Thread Tom Hawkins

Hi Matt,

On Feb 9, 2008 1:07 PM, Matthew Naylor [EMAIL PROTECTED] wrote:
 If you go the real compiler route, would it not make sense to take the
 DSL as the source language rather than Haskell?  Or are the DSL and
 Haskell quite similar?

The two are nearly identical.  In fact the only significant difference
between the languages is the semantics of top level monad; it wouldn't
be IO, but something else.  With the syntax the same, it could
leverage much of Haskell's standard library.

 Or perhaps you are thinking of a two language
 system, where some code is evaluated at compile time by Haskell, and
 some is compiled to the target language?

Not necessarily in the same compilation flow, but I can think of
several scenarios where it would be advantageous for code written in
this other language to be pulled into a conventional Haskell program.

 Taking options 2 or 5 just to solve the sharing problem sounds to me
 like a lot of hard work for little reward.  But don't worry, I won't
 repeat my observable sharing speech. :-)

So is the general strategy with observable sharing to use
unsafePerformIO with Data.Unique to label expressions at construction?
 Ahh...clever!  I did not think of this.  Of course, now that you have
me reading up on Yhc.Core, option #5 is looking considerably more fun.

-Tom
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] I love purity, but it's killing me.

2008-02-08 Thread Bulat Ziganshin

Hello Tom,

Friday, February 8, 2008, 9:33:35 AM, you wrote:

 The process of converting an expression tree to a graph uses either Eq
 or Ord (either derived or a custom instance) to search and build a set
 of unique nodes to be ordered for execution.

in similar situation, i've added hash field to each node, initialized
by smart constructor:

data Expression = Add Hash Expression Expression | ...
type Hash=Int

add x y = Add (x*y+1234567) x y
...


-- 
Best regards,
 Bulatmailto:[EMAIL PROTECTED]

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] I love purity, but it's killing me.

2008-02-08 Thread Henning Thielemann


On Fri, 8 Feb 2008, Tom Hawkins wrote:

 I've been programming with Haskell for a few years and love it.  One
 of my favorite applications of Haskell is using for domain specific
 languages.  However, after designing a handful of DSLs, I continue to
 hit what appears to be a fundamental hurdle -- or at least I have yet
 to find an adequate solution.

It seems to become a FAQ. I think all DSLs suffer from the same problems:
sharing and recursion. I've used wrappers for CSound, SuperCollider,
MetaPost, they all have these problems.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] I love purity, but it's killing me.

2008-02-08 Thread Matthew Naylor

Hi,

(Warning: longish message!)

There is some concern, and rightly so, that observable sharing is
dangerous, and that your Haskell program will explode if you use it,
and perhaps even that anyone who uses it is dirty and should be sent
to matron's for a good scrubbing!  However, when used safely, it is
not a hack, nor even dirty, but a nice, simple solution to an
otherwise nasty problem.  Below I define what I mean by safely.

First consider the circumstances under which obserevable sharing is
useful.  Typically we have some Haskell function f that produces a
symbolic representation of an expression.  For whatever reason, we'd
like to *generate a program* that computes the value of this
expression, rather that computing it in Haskell.  For example, in
Lava, we want to generate a VHDL program so that the expression can be
computed on, say, an FPGA.  In Tom's case, he wants to generate a C
program to compute the expression.  All perfectly reasonable, and in
my opinion, a very powerfull way to use Haskell.

Now recall that referential transparency lets you replace equals with
equals without changing the *value produced* by a program.  Note that
it says nothing about preserving *runtime behaviour*.  Sharing, for
example, may be lost.  So if you do equational reasoning on function
f (above), and loose some sharing, then you can only expect that the
same sharing will also be also lost in the generated program.  As long
as the generated program computes the same result as it did before,
referential transparency will be, overall, preserved; it would only be
lost intermediately.  This is what I mean by safe.

Now, there remains the concern that Haskell's semantics does not
enforce sharing.  A Haskell compiler is free to change the sharing a
program at a whim, unknowingly to the programmer who may be relying on
it in for an efficient program.  However, to my knowledge, it is an
unwritten rule of Haskell compilers that sharing *is* preserved, and
that they do perform *graph* reduction.  Clean, a similar language to
Haskell, indeed has a semantics based on graphs.  So I don't believe
Haskell being non-strict (not necessarily lazy) is a good reason for
not using observable sharing.  Though I do feel better when I compile
without -O. :-)

Finally, when I say observable sharing, I don't necessarily mean it
as defined by Koen Claessen and David Sands.  I simply mean the use of
unsafePerformIO to detect sharing, whether or not this is done by an
eq predicate on Refs. (I say this because I think there are simpler
ways to detect sharing, though these will probably not have the nice
semantic properties of observable sharing.)

Sorry for the somewhat long exposition. :-)

Matt.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] I love purity, but it's killing me.

2008-02-08 Thread Tom Hawkins

On 2/8/08, Emil Axelsson [EMAIL PROTECTED] wrote:
 I know of a few of ways to express sharing in a pure language:

 1) Observable sharing, which, in general, is unsafe.
 2) Using Template Haskell
 3) Matthew Naylor has done some work on expressible sharing, which has
 4) Use a monad (but I'm sure this is what you're trying to avoid).

Or...

5) Forget embedding the DSL, and write a direct compiler.

In addition to the sharing problem, another shortcoming of Haskell
DSLs is they can not fully exploit the benefits of algebraic
datatypes.  Specifically, pattern matching ADTs can only be used to
control the compile-time configuration of the target, it can't be used
to describe the target's behavior -- at least for DSLs that generate
code that executes outside of Haskell's runtime.

Writing a real compiler would solve both of these problems.  Is there
any Haskell implementation that has a clean cut-point, from which I
can start from a fully type-checked, type-annotated intermediate
representation?

And thanks for the link to John's paper describing Hydra's use of
Template Haskell.  I will definiately consider TH.

-Tom
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] I love purity, but it's killing me.

2008-02-08 Thread Tim Chevalier

On 2/8/08, Matthew Naylor [EMAIL PROTECTED] wrote:
 it in for an efficient program.  However, to my knowledge, it is an
 unwritten rule of Haskell compilers that sharing *is* preserved, and
 that they do perform *graph* reduction.  Clean, a similar language to

I'm not sure that programmers ought to be relying on this rule. Sure,
all Haskell compilers I know of preserve sharing and do graph
reduction. But conventional wisdom is not the same thing as an
unwritten rule. Someday, someone might come along and write a Haskell
compiler that isn't based on graph reduction and doesn't preserve
sharing at the implementation level (while still preserving the
informal semantics of Haskell). A programmer who had written code that
failed to compile correctly under this hypothetical compiler would be
a very naughty Haskell programmer indeed.

 Haskell, indeed has a semantics based on graphs.  So I don't believe

Haskell doesn't have a semantics, graph-based or not... or at least
not a formal one, and if not a formal one, I don't know what you mean
:-)

Cheers,
Tim

-- 
Tim Chevalier * http://cs.pdx.edu/~tjc * Often in error, never in doubt
There are no sexist decisions to be made. There are antisexist
decisions to be made. And they require tremendous energy and
self-scrutiny, as well as moral stamina... -- Samuel R. Delany
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] I love purity, but it's killing me.

2008-02-08 Thread Don Stewart

tomahawkins:
 On 2/8/08, Emil Axelsson [EMAIL PROTECTED] wrote:
  I know of a few of ways to express sharing in a pure language:
 
  1) Observable sharing, which, in general, is unsafe.
  2) Using Template Haskell
  3) Matthew Naylor has done some work on expressible sharing, which has
  4) Use a monad (but I'm sure this is what you're trying to avoid).
 
 Or...
 
 5) Forget embedding the DSL, and write a direct compiler.
 
 In addition to the sharing problem, another shortcoming of Haskell
 DSLs is they can not fully exploit the benefits of algebraic
 datatypes.  Specifically, pattern matching ADTs can only be used to
 control the compile-time configuration of the target, it can't be used
 to describe the target's behavior -- at least for DSLs that generate
 code that executes outside of Haskell's runtime.
 
 Writing a real compiler would solve both of these problems.  Is there
 any Haskell implementation that has a clean cut-point, from which I
 can start from a fully type-checked, type-annotated intermediate
 representation?

Taking the output of GHC's intermediate phase, after optimising
leaves you with type checked, optimised, 'Core' -- basically
lambda calculus with extras.

It's a good start if you then want to hand-compile that down.

Extract it with -fext-core

-- Don
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] I love purity, but it's killing me.

2008-02-07 Thread Alfonso Acosta

As I pointed out a few days ago in another thread, you can benefit
from using Observable sharing [1]

Be warned that Observable sharing is a non-conservative extension of
Haskell and it breaks referential transparency.

[1] http://www.cs.chalmers.se/~koen/pubs/entry-asian99-lava.html

On Feb 8, 2008 7:33 AM, Tom Hawkins [EMAIL PROTECTED] wrote:
 I've been programming with Haskell for a few years and love it.  One
 of my favorite applications of Haskell is using for domain specific
 languages.  However, after designing a handful of DSLs, I continue to
 hit what appears to be a fundamental hurdle -- or at least I have yet
 to find an adequate solution.

 My DSLs invariably define a datatype to capture expressions; something
 like this:

 data Expression
   = Add Expression Expression
   | Sub Expression Expression
   | Variable String
   | Constant Int
   deriving Eq

 Using the datatype Expression, it is easy to mass a collections of
 functions to help assemble complex expressions, which leads to very
 concise programs in the DSL.

 The problem comes when I want to generate efficient code from an
 Expression (ie. to C or some other target language).  The method I use
 invovles converting the tree of subexpressions into an acyclic graphic
 to eliminate common subexpressions.  The nodes are then topologically
 ordered and assigned an instruction, or statement for each node.  For
 example:

 let a = Add (Constant 10) (Variable i1)
 b = Sub (Variable i2) (Constant 2)
 c = Add a b

 would compile to a C program that may look like this:

   a = 10 + i1;
   b = i2 - 2;
   c = a + b;

 The process of converting an expression tree to a graph uses either Eq
 or Ord (either derived or a custom instance) to search and build a set
 of unique nodes to be ordered for execution.  In this case a, then
 b, then c.  The problem is expressions often have shared,
 equivalent subnodes, which dramatically grows the size of the tree.
 For example:

 let d = Add c c
 e = Add d d-- e now as 16 leaf nodes.

 As these trees grow in size, the equality comparison in graph
 construction quickly becomes the bottleneck for DSL compilation.
 What's worse, the phase transition from tractable to intractable is
 very sharp.  In one of my DSL programs, I made a seemingly small
 change, and compilation time went from milliseconds to
 not-in-a-million-years.

 Prior to Haskell, I wrote a few DSLs in OCaml.  I didn't have this
 problem in OCaml because each let expression was mutable, and I
 could use the physical equality operator to perform fast comparisons.
 Unfortunately, I have grown to love Haskell's type system and its lack
 of side effects, and could never go back.

 Is there anything that can be done to dramatically speed up
 comparisons, or is there a better approach I can take to extract
 common subexpressions?  I should point out I have an opportunity to
 get Haskell on a real industrial application.  But if I can't solve
 this problem, I may have to resort to far less eloquent languages.
 :-(

 Thanks for any and all help.

 -Tom
 ___
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] I love purity, but it's killing me.

2008-02-07 Thread Emil Axelsson


I know of a few of ways to express sharing in a pure language:

1) Observable sharing, which, in general, is unsafe.

  http://www.cs.chalmers.se/~koen/pubs/entry-asian99-lava.html

2) Using Template Haskell

  http://www.dcs.gla.ac.uk/publications/PAPERS/7524/EmbedHDLinTH.ps

3) Matthew Naylor has done some work on expressible sharing, which has 
advantages over both methods above. I don't have any reference though...


4) Use a monad (but I'm sure this is what you're trying to avoid).

/ Emil



On 2008-02-08 07:33, Tom Hawkins wrote:

I've been programming with Haskell for a few years and love it.  One
of my favorite applications of Haskell is using for domain specific
languages.  However, after designing a handful of DSLs, I continue to
hit what appears to be a fundamental hurdle -- or at least I have yet
to find an adequate solution.

My DSLs invariably define a datatype to capture expressions; something
like this:

data Expression
  = Add Expression Expression
  | Sub Expression Expression
  | Variable String
  | Constant Int
  deriving Eq

Using the datatype Expression, it is easy to mass a collections of
functions to help assemble complex expressions, which leads to very
concise programs in the DSL.

The problem comes when I want to generate efficient code from an
Expression (ie. to C or some other target language).  The method I use
invovles converting the tree of subexpressions into an acyclic graphic
to eliminate common subexpressions.  The nodes are then topologically
ordered and assigned an instruction, or statement for each node.  For
example:

let a = Add (Constant 10) (Variable i1)
b = Sub (Variable i2) (Constant 2)
c = Add a b

would compile to a C program that may look like this:

  a = 10 + i1;
  b = i2 - 2;
  c = a + b;

The process of converting an expression tree to a graph uses either Eq
or Ord (either derived or a custom instance) to search and build a set
of unique nodes to be ordered for execution.  In this case a, then
b, then c.  The problem is expressions often have shared,
equivalent subnodes, which dramatically grows the size of the tree.
For example:

let d = Add c c
e = Add d d-- e now as 16 leaf nodes.

As these trees grow in size, the equality comparison in graph
construction quickly becomes the bottleneck for DSL compilation.
What's worse, the phase transition from tractable to intractable is
very sharp.  In one of my DSL programs, I made a seemingly small
change, and compilation time went from milliseconds to
not-in-a-million-years.

Prior to Haskell, I wrote a few DSLs in OCaml.  I didn't have this
problem in OCaml because each let expression was mutable, and I
could use the physical equality operator to perform fast comparisons.
Unfortunately, I have grown to love Haskell's type system and its lack
of side effects, and could never go back.

Is there anything that can be done to dramatically speed up
comparisons, or is there a better approach I can take to extract
common subexpressions?  I should point out I have an opportunity to
get Haskell on a real industrial application.  But if I can't solve
this problem, I may have to resort to far less eloquent languages.
:-(

Thanks for any and all help.

-Tom
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Re: [Haskell-cafe] I love purity, but it's killing me.

RE: [Haskell-cafe] I love purity, but it's killing me.

Re: [Haskell-cafe] I love purity, but it's killing me.

Re: [Haskell-cafe] I love purity, but it's killing me.

Re: [Haskell-cafe] I love purity, but it's killing me.

Re: [Haskell-cafe] I love purity, but it's killing me.

Re: [Haskell-cafe] I love purity, but it's killing me.

RE: [Haskell-cafe] I love purity, but it's killing me.

Re: [Haskell-cafe] I love purity, but it's killing me.

Re: [Haskell-cafe] I love purity, but it's killing me.

Re: [Haskell-cafe] I love purity, but it's killing me.

Re: [Haskell-cafe] I love purity, but it's killing me.

Re: [Haskell-cafe] I love purity, but it's killing me.

Re: [Haskell-cafe] I love purity, but it's killing me.

Re: [Haskell-cafe] I love purity, but it's killing me.

Re: [Haskell-cafe] I love purity, but it's killing me.

Re: [Haskell-cafe] I love purity, but it's killing me.

Re: [Haskell-cafe] I love purity, but it's killing me.

Re: [Haskell-cafe] I love purity, but it's killing me.

Re: [Haskell-cafe] I love purity, but it's killing me.

Re: [Haskell-cafe] I love purity, but it's killing me.

Re: [Haskell-cafe] I love purity, but it's killing me.

Re: [Haskell-cafe] I love purity, but it's killing me.

Re: [Haskell-cafe] I love purity, but it's killing me.

Re: [Haskell-cafe] I love purity, but it's killing me.

Re: [Haskell-cafe] I love purity, but it's killing me.

Re: [Haskell-cafe] I love purity, but it's killing me.

Re: [Haskell-cafe] I love purity, but it's killing me.

Re: [Haskell-cafe] I love purity, but it's killing me.

29 matches

Site Navigation

Mail list logo

Footer information