Re: [Haskell-cafe] Sub-optimal [code]

2011-02-20 Thread Andrew Coppin

On 16/02/2011 11:09 PM, Max Bolingbroke wrote:


Thinking about it some more, this example is actually quite
interesting because if you *prevent* the list from being floated the
forM gets foldr/build fused into a totally listless optimal loop. It
really does seem like a shame to disable that optimisation because of
the floating... if only the fusion hit before float-out was run.


It seems to me that in this case, I'm using a list when what I actually 
mean is a stream. (Here stream refers to the construct used in the 
stream-fusion package.) I can't actually *want* a physical data 
structure to be constructed - so why am I asking for one?


It sems to me that lots of Haskell code uses lists where it actually 
means streams... The stream-fusion package uses streams to fuse together 
list operations, but I rather suspect it would be cleaner and more 
helpful if people wrote code explicitly in terms of streams in the first 
place, except for the small minority of places where you really do want 
an actual list.


But that's just my opinion...

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Sub-optimal [code]

2011-02-16 Thread James Andrew Cook
On Feb 15, 2011, at 6:05 PM, Daniel Fischer daniel.is.fisc...@googlemail.com 
wrote:

 On Tuesday 15 February 2011 23:29:39, Andrew Coppin wrote:
 
 Ouch! 
 
 I suppose what we could really do with is a combinator that runs a
 monadic action N times, without actually constructing a list N elements
 long in order to do so.
 
 True enough. But I guess nobody¹ bothered yet because there are so many 
 possible designs and most of them are trivial to implement in a line or two 
 (so the pain of writing them repeatedly isn't bad enough).
 
 ¹ Not quite true, there's the monad-loops package on hackage which provides 
 a handful of loops. But not the trivial nTimesDo.
 

Doesn't Control.Monad.replicateM_ do exactly that?

-- James
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Sub-optimal [code]

2011-02-16 Thread Daniel Fischer
On Wednesday 16 February 2011 19:31:05, James Andrew Cook wrote:

 Doesn't Control.Monad.replicateM_ do exactly that?


Yes, right, forgot about that. That would've worked fine in Andrew's case.


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Sub-optimal [code]

2011-02-16 Thread Andrew Coppin

On 16/02/2011 06:31 PM, James Andrew Cook wrote:


Doesn't Control.Monad.replicateM_ do exactly that?


10 points to Gryffindore.

(Now, if only there was a version that feeds an integer to the monadic 
action as well... Still, it's not hard to implement.)


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Sub-optimal [code]

2011-02-16 Thread Max Bolingbroke
On 16 February 2011 21:51, Andrew Coppin andrewcop...@btinternet.com wrote:
 (Now, if only there was a version that feeds an integer to the monadic
 action as well... Still, it's not hard to implement.)

As simple as:
  forM [1..x] mk_my_action

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Sub-optimal [code]

2011-02-16 Thread Daniel Fischer
On Wednesday 16 February 2011 23:13:10, Max Bolingbroke wrote:
 On 16 February 2011 21:51, Andrew Coppin andrewcop...@btinternet.com 
wrote:
  (Now, if only there was a version that feeds an integer to the monadic
  action as well... Still, it's not hard to implement.)

 As simple as:
   forM [1..x] mk_my_action


The problem with that is that under certain circumstances the list is 
shared in nested loops, which was what caused the thread (it was mapM_ and 
not forM_, but I'd be very surprised if they behaved differently with -O2).

What Andrew wants is a listless forM[_].

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Sub-optimal [code]

2011-02-16 Thread Max Bolingbroke
On 16 February 2011 22:48, Daniel Fischer
daniel.is.fisc...@googlemail.com wrote:
 The problem with that is that under certain circumstances the list is
 shared in nested loops, which was what caused the thread (it was mapM_ and
 not forM_, but I'd be very surprised if they behaved differently with -O2).

Yep - d'oh!

Thinking about it some more, this example is actually quite
interesting because if you *prevent* the list from being floated the
forM gets foldr/build fused into a totally listless optimal loop. It
really does seem like a shame to disable that optimisation because of
the floating... if only the fusion hit before float-out was run.

Max

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Sub-optimal [code]

2011-02-15 Thread Andrew Coppin

I tried -O2 -fno-cse. No difference.

I also tried -O2 -fno-full-laziness. BIG DIFFERENCE.


See also the very old GHC ticket at
http://hackage.haskell.org/trac/ghc/ticket/917


I don't know if that's the problem or not, but it might plausibly be.

Here's the smallest version of the program that I could come up with 
[which still misbehaves]:


module Main (main) where

import System.IO
import System.Random

main = do
  file_batch 01-Uniform random_byte_uniform

random_byte_uniform :: IO Int
random_byte_uniform = randomRIO (0x00, 0xFF)

random_file :: String - Int - IO Int - IO ()
random_file f n rnd = do
  putStrLn $ Save:  ++ f ++  [ ++ show n ++  bytes]
  h - openFile f WriteMode
  hSetBinaryMode h True
  mapM_ (\ _ - rnd = hPutChar h . toEnum) [1..n]
  hClose h

file_batch :: String - IO Int - IO ()
file_batch f rnd =
  mapM_
(\ k -
  mapM_
(\ n -
  random_file
(f ++ - ++ show k ++ x- ++ [n])
(10 * 1024 * 1024 * k)
rnd
)
ABCD
)
[1..4]

If main calls random_file directly, the program seems to work OK, so the 
problem seems to be file_batch. Maybe. I don't really know. I had a go 
at playing with -ddump-simpl, but that just generated a 8 KB file which 
is utterly incomprehensible. (Well, the -O0 variant is just about 
comprehensible. The -O2 variant isn't. But it appears that *everything* 
gets inlined into main...)


If anybody can figure out what's happening here, I'd be interested to know.

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Sub-optimal [code]

2011-02-15 Thread Daniel Fischer
On Tuesday 15 February 2011 20:15:54, Andrew Coppin wrote:
  I tried -O2 -fno-cse. No difference.
 
  I also tried -O2 -fno-full-laziness. BIG DIFFERENCE.
 
  See also the very old GHC ticket at
  http://hackage.haskell.org/trac/ghc/ticket/917

 I don't know if that's the problem or not, but it might plausibly be.

 Here's the smallest version of the program that I could come up with
 [which still misbehaves]:

 module Main (main) where

 import System.IO
 import System.Random

 main = do
file_batch 01-Uniform random_byte_uniform

 random_byte_uniform :: IO Int
 random_byte_uniform = randomRIO (0x00, 0xFF)

 random_file :: String - Int - IO Int - IO ()
 random_file f n rnd = do
putStrLn $ Save:  ++ f ++  [ ++ show n ++  bytes]
h - openFile f WriteMode
hSetBinaryMode h True
mapM_ (\ _ - rnd = hPutChar h . toEnum) [1..n]
hClose h

 file_batch :: String - IO Int - IO ()
 file_batch f rnd =
mapM_
  (\ k -
mapM_
  (\ n -
random_file
  (f ++ - ++ show k ++ x- ++ [n])
  (10 * 1024 * 1024 * k)
  rnd
  )
  ABCD
  )
  [1..4]

 If main calls random_file directly, the program seems to work OK, so the
 problem seems to be file_batch. Maybe.

Or, one could say, the problem is the export list :)
If you remove the export list, so that random_file is exported, the leak 
disappears (at least with 7.0.1, didn't test 6.12).
If nothing but main is exported, GHC can be much more aggressive with 
inlining, and it is.

The result is that the list

[1 .. 10*1024*1024*k]

from the penultimate line of random_file is shared between the four 
iterations of the inner loop in file_batch (for k = 1 .. 4). Oops.

If random_file is exported, it is too big to inline it (at least if you 
don't specifically ask for it), so you get no sharing (even better, GHC 
rewrites

mapM_ (\ _ - rnd = hPutChar h . toEnum) [1..n]

to a nice loop, the list isn't constructed at all).

 I don't really know. I had a go
 at playing with -ddump-simpl, but that just generated a 8 KB file which
 is utterly incomprehensible. (Well, the -O0 variant is just about
 comprehensible. The -O2 variant isn't.

You have to look for interesting stuff (in this case the list [1 .. n]) and 
note its identifier (yes, coping with the identifiers in core is hard, 
especially when they are entirely compiler-generated and don't start with a 
source-code name), then see how it is used.

 But it appears that *everything*
 gets inlined into main...)

That's kind of the point of

module Main (main) where

Sometimes that's good, other times not.


 If anybody can figure out what's happening here, I'd be interested to
 know.

Due to the extensive inlining, GHC sees that some values are reused, so it 
decides to share those values instead of recomputing them.
Unfortunately, those values are long lists.
Making GHC look at smaller chunks of the code prevents that, as does 
turning off full-laziness (in both cases the let-binding of the list 
doesn't get floated out of random_file, that floating [more precisely, the 
resulting sharing] is what causes the leak).

Which makes me wonder: unwanted sharing of lists [1 .. n] or similar is a 
frequent cause of space leaks, so would it be possible to teach GHC to not 
share such lists (unless they're bound to a name to indicate sharing is 
wanted)?
In particular for enumerations [a .. b] of type [Int], [Integer] or 
similar, I'm pretty sure that the cost of recomputation is far outweighed 
by the memory consumption of sharing in almost all cases.


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Sub-optimal [code]

2011-02-15 Thread Claude Heiland-Allen

On 15/02/11 20:35, Daniel Fischer wrote:

Which makes me wonder: unwanted sharing of lists [1 .. n] or similar is a
frequent cause of space leaks, so would it be possible to teach GHC to not
share such lists (unless they're bound to a name to indicate sharing is
wanted)?



In particular for enumerations [a .. b] of type [Int], [Integer] or
similar, I'm pretty sure that the cost of recomputation is far outweighed
by the memory consumption of sharing in almost all cases.


Compare with the heap profile graph output from this short program which 
uses a horrible data-dependency hack to force recomputation:


main = do
  print $ length
[(x,y) | x - [(1 :: Int) .. 1], y - [(1 :: Int) .. 1]]
  print $ length
[(x,y) | x - [(1 :: Int) .. 1], y - [x+1-x .. 1]]

The heap profile graph looks a little like this:





___

(Tested with ghc 6.12.3 -O2 on linux x86_64)


Claude
--
http://claudiusmaximus.goto10.org

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Sub-optimal [code]

2011-02-15 Thread Daniel Fischer
On Tuesday 15 February 2011 22:20:06, Claude Heiland-Allen wrote:

 Compare with the heap profile graph output from this short program which
 uses a horrible data-dependency hack to force recomputation:

 main = do
print $ length
  [(x,y) | x - [(1 :: Int) .. 1], y - [(1 :: Int) .. 1]]
print $ length
  [(x,y) | x - [(1 :: Int) .. 1], y - [x+1-x .. 1]]

 The heap profile graph looks a little like this:

 
 
 
 
 ___

 (Tested with ghc 6.12.3 -O2 on linux x86_64)


Yup, confirmed with 6.12.3 and 7.0.1 on x86 linux (again behaves 
differently with -fno-full-laziness).
Not only does the second use less memory, it is also faster (something 
around 10%).
Thanks for the nice example.


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Sub-optimal [code]

2011-02-15 Thread Andrew Coppin

On 15/02/2011 08:35 PM, Daniel Fischer wrote:


The result is that the list

[1 .. 10*1024*1024*k]

from the penultimate line of random_file is shared between the four
iterations of the inner loop in file_batch (for k = 1 .. 4). Oops.


Ouch! That's gotta sting in the morning... o_O

I suppose what we could really do with is a combinator that runs a 
monadic action N times, without actually constructing a list N elements 
long in order to do so. Then it becomes blatently obvious that there's 
nothing to share, and the problem goes away.


They say that in Haskell, a list *is* a loop. Apparently, not always. ;-)

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Sub-optimal [code]

2011-02-15 Thread Daniel Fischer
On Tuesday 15 February 2011 23:29:39, Andrew Coppin wrote:
 On 15/02/2011 08:35 PM, Daniel Fischer wrote:
  The result is that the list
 
  [1 .. 10*1024*1024*k]
 
  from the penultimate line of random_file is shared between the four
  iterations of the inner loop in file_batch (for k = 1 .. 4). Oops.

 Ouch! That's gotta sting in the morning... o_O

I think it has already stung, or we wouldn't have this thread, would we?


 I suppose what we could really do with is a combinator that runs a
 monadic action N times, without actually constructing a list N elements
 long in order to do so.

True enough. But I guess nobody¹ bothered yet because there are so many 
possible designs and most of them are trivial to implement in a line or two 
(so the pain of writing them repeatedly isn't bad enough).

¹ Not quite true, there's the monad-loops package on hackage which provides 
a handful of loops. But not the trivial nTimesDo.

 Then it becomes blatently obvious that there's
 nothing to share, and the problem goes away.

 They say that in Haskell, a list *is* a loop. Apparently, not always.

No, not always. Some lists are just, you know, lists.


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Sub-optimal

2011-02-14 Thread Andrew Coppin

Is this a known bug? (GHC 6.10.x)


It's known to happen when optimising shares what shouldn't be shared. Try
compiling with -O2 -fno-cse (if that doesn't help, it doesn't necessarily
mean it's not unwanted sharing, though).
And, please, let us see some code to identify the problem.


I tried -O2 -fno-cse. No difference.

I also tried -O2 -fno-full-laziness. BIG DIFFERENCE.

The program now runs in constant space (like with -O0), but it also runs 
about 2x faster than -O0.


I have no idea what these switches do, but clearly one of these 
optimisations is actually pessimal for this particular program.


I still want to try compiling with a newer version of GHC to see what 
difference that makes.


(And yes, if I had the code on this PC, I could post it. It's kinda long 
though... In essence, it just calls randomRIO a bazillion times and 
writes the results into a file using hPutChar.)


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Sub-optimal

2011-02-14 Thread Max Bolingbroke
On 14 February 2011 21:00, Andrew Coppin andrewcop...@btinternet.com wrote:
 Is this a known bug? (GHC 6.10.x)

 It's known to happen when optimising shares what shouldn't be shared. Try
 compiling with -O2 -fno-cse (if that doesn't help, it doesn't necessarily
 mean it's not unwanted sharing, though).
 And, please, let us see some code to identify the problem.

 I tried -O2 -fno-cse. No difference.

 I also tried -O2 -fno-full-laziness. BIG DIFFERENCE.

See also the very old GHC ticket at
http://hackage.haskell.org/trac/ghc/ticket/917

Max

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Sub-optimal

2011-02-13 Thread Maciej Wos
I was battling a similar (the same?) issue recently. The problem might
indeed be caused by excessive sharing. There's a good example in GHC's
trac [1]. Try compiling your code with -O2 and -fno-full-laziness.

There is also an issue with full-laziness and recursive overloaded
functions [2]. Again, compiling with -fno-full-laziness should help.
Alternatively, if you're using ghc-7.0.1, try switching to HEAD.

-- Maciej

[1] http://hackage.haskell.org/trac/ghc/ticket/917
[2] 
http://www.haskell.org/pipermail/glasgow-haskell-users/2011-February/019997.html

On Sat, Feb 12, 2011 at 7:30 PM, Andrew Coppin
andrewcop...@btinternet.com wrote:
 I have a small program that fills a file with random numbers. If I compile
 it without optimisation, it runs in constant space. And yet, if I supply -O2
 (or even just -O1), for large output files the program gobbles large amounts
 of RAM.

 Is this a known bug? (GHC 6.10.x)

 ___
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Sub-optimal

2011-02-12 Thread Andrew Coppin
I have a small program that fills a file with random numbers. If I 
compile it without optimisation, it runs in constant space. And yet, if 
I supply -O2 (or even just -O1), for large output files the program 
gobbles large amounts of RAM.


Is this a known bug? (GHC 6.10.x)

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Sub-optimal

2011-02-12 Thread Daniel Fischer
On Saturday 12 February 2011 11:30:26, Andrew Coppin wrote:
 I have a small program that fills a file with random numbers. If I
 compile it without optimisation, it runs in constant space. And yet, if
 I supply -O2 (or even just -O1), for large output files the program
 gobbles large amounts of RAM.

 Is this a known bug? (GHC 6.10.x)

It's known to happen when optimising shares what shouldn't be shared. Try 
compiling with -O2 -fno-cse (if that doesn't help, it doesn't necessarily 
mean it's not unwanted sharing, though).
And, please, let us see some code to identify the problem.


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe