Send Beginners mailing list submissions to
[email protected]
To subscribe or unsubscribe via the World Wide Web, visit
http://www.haskell.org/mailman/listinfo/beginners
or, via email, send a message with subject or body 'help' to
[email protected]
You can reach the person managing the list at
[email protected]
When replying, please edit your Subject line so it is more specific
than "Re: Contents of Beginners digest..."
Today's Topics:
1. Re: Stack space overflow with foldl' (Daniel Fischer)
2. Again on list manipulations (Lorenzo Isella)
3. Re: Again on list manipulations (Alex Rozenshteyn)
4. Re: Again on list manipulations (Lorenzo Isella)
5. Re: Again on list manipulations (Daniel Fischer)
6. Re: Stack space overflow with foldl' (Mathew de Detrich)
----------------------------------------------------------------------
Message: 1
Date: Fri, 10 Sep 2010 16:20:40 +0200
From: Daniel Fischer <[email protected]>
Subject: Re: [Haskell-beginners] Stack space overflow with foldl'
To: [email protected]
Cc: Ryan Prichard <[email protected]>
Message-ID: <[email protected]>
Content-Type: text/plain; charset="iso-8859-1"
On Friday 10 September 2010 11:29:56, Ryan Prichard wrote:
> Hi,
>
> I see a stack overflow with this code, but I don't understand why.
Neither do I, really, but ghc is being too clever for its own good - or not
clever enough.
>
> I looked at the Core output with ghc-core, with or without
> optimizations, and I see a $wlgo recursive function that doesn't appear
> to end in a tail call.
Without optimisations, I see a nice tail-recursive lgo inside foldl'2,
pretty much what one would like to see.
With optimisations, you get a specialised worker $wlgo:
Rec {
Main.$wlgo :: [GHC.Types.Int] -> (##)
GblId
[Arity 1
NoCafRefs
Str: DmdType S]
Main.$wlgo =
\ (w_sms :: [GHC.Types.Int]) ->
case case w_sms of _ {
[] -> GHC.Unit.();
: x_adq xs_adr ->
case x_adq of _ { GHC.Types.I# _ ->
case Main.$wlgo xs_adr of _ { (# #) -> GHC.Unit.() }
}
}
of _ { () ->
GHC.Prim.(##)
}
end Rec }
and it's here that ghc shows the wrong amount of cleverness.
What have we? At the types () and [Int], with f = flip seq, the step in lgo
unfolds to
lgo z (x:xs)
~> let z' = f z x in lgo z' xs
~> case f z x of
z'@() -> lgo z' xs
~> case (case x of { I# _ -> z }) of
z'@() -> lgo z' xs
Now flip seq returns its first argument unless its second argument is _|_
and () has only one non-bottom value, so the ()-argument of lgo can be
eliminated (here, the initial ()-argument is known to be (), in general the
wrapper checks for _|_ before entering the worker), giving
wlgo :: [Int] -> ()
wlgo [] = ()
wlgo (x:xs) =
case (case x of { I# _ -> () }) of
() -> wlgo xs
It would be nice if the compiler rewrote the last equation to
wlgo (x:xs) -> case x of { I# _ -> wlgo xs }
, but apparently it can't. Still, that's pretty good code.
Now comes the misplaced cleverness.
Working with unboxed types is better (faster) than working with boxed
types, so let's use unboxed types, giving $wlgo the type
[Int] -> (##)
(unboxed 0-tuple). But it wasn't clever enough to directly return (# #) in
the case of an empty list - that would've allowed the core to be
\ (w_sms :: [GHC.Types.Int]) ->
case w_sms of _ {
[] -> GHC.Prim.(##)
: x_adq xs_adr ->
case x_adq of _ { GHC.Types.I# _ ->
Main.$wlgo xs_adr }
}
and all would've been fine and dandy. So it chose [] -> GHC.Unit.() and
that forced the second branch to also have that type, hence you can't have
a tail call there but have to box the result of the recursive call (only to
unbox it again).
So you get superfluous boxing, unboxing, reboxing in addition to a stack-
eating recursion.
But you've hit a rare case here, if you use foldl'2 (flip seq) at a type
with more than one non-bottom value, the first argument of lgo is not
eliminated and you don't get the boxing-unboxing dance, instead you get a
nice tail-recursion, even if foldl'2 is used only once and not exported.
Why GHC does that for (), beats me.
> I don't see any let expressions in the
> folding code, so I assume no thunks are being created. I can make a
> tail call appear by doing either of two things:
>
> 1. Replace "lgo z []" with "lgo !z []". This suggestion came from an
> email on haskell-beginners that I can't find right now. It was a few
> months ago.
Perhaps http://www.haskell.org/pipermail/beginners/2010-August/005016.html
?
>
> 2. Instead of using the __GLASGOW_HASKELL__ version of foldl', use the
> other version:
>
> foldl' f a [] = a
> foldl' f a (x:xs) = let a' = f a x in a' `seq` foldl' f a' xs
But that needs to pass also the function, which is generally slower than
having it as a static argument.
3. {-# NOINLINE foldl'2 #-}
But that's not so good for performance in general.
Option 1 gives the best Core, but it changes behaviour, with that,
foldl' (\_ _ -> 1) undefined [0] = 1
foldl'2 (\_ _ -> 1) undefined [0] = _|_
To retain the behaviour of foldl',
foldl'2 f z0 xs0 =
case xs0 of
[] -> z0
(x:xs) -> lgo (f z0 x) xs
where
lgo !z [] = z
lgo z (y:ys) = lgo (f z y) ys
>
> My test case is contrived. Originally, I had a program that read
> lines from a file as Data.ByteString values, and I was trying to debug
> a stack overflow. I added the foldl' call to force the evaluation of
> the ByteString lines, but the foldl' call itself overflowed the
> stack.
That's probably a different matter, foldl' evaluates the accumulation
parameter only to weak head normal form, if it's not of simple type, it can
still contain thunks that will overflow the stack when demanded.
>
> I might have fixed my original stack overflow problem. I was applying
> sum to a large list of integers, and sum is lazy.
For [Int] and [Integer], sum is specialised, so when compiled with
optimisations, ghc should use a strict version for those types.
> I don't think I have
> any real code anymore that overflows the stack, but I'm uncomfortable
> because I don't know why my test case doesn't work.
>
> Is the foldl' call in my test case allowed to have linear stack usage?
I don't think the language definition treats that, so technically it's
allowed then. But it shouldn't happen.
>
> Thanks,
> -Ryan
------------------------------
Message: 2
Date: Fri, 10 Sep 2010 16:55:43 +0200
From: Lorenzo Isella <[email protected]>
Subject: [Haskell-beginners] Again on list manipulations
To: [email protected]
Message-ID: <[email protected]>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Dear All,
I know this must be a one-liner, but it am banging my head against the wall.
I am trying my hands at list manipulation in Haskell and a lot of useful
function are making my life easier but I cannot achieve something really
simple.
Let us say you have the lists ml and sel
ml=[23,44,55,8,98]
and
sel=[1,2] .
Now, I would like simply to get a new list whose entries are the
elements of ml in position sel. In my case things might be a little more
complicated because all the elements are Integer and not Int (I noticed
that sometimes this means I have to resort to generic functions).
Cheers
Lorenzo
------------------------------
Message: 3
Date: Fri, 10 Sep 2010 11:03:03 -0400
From: Alex Rozenshteyn <[email protected]>
Subject: Re: [Haskell-beginners] Again on list manipulations
To: Lorenzo Isella <[email protected]>
Cc: [email protected]
Message-ID:
<[email protected]>
Content-Type: text/plain; charset="utf-8"
well,
> selFn = map (\x -> (! x)) sel -- This is not necessarily the most elegant
way to word it
would return a list of functions, which, when applied to a list, return an
element from it based on the index
> results = map ($ ml) selFn
would solve your problem
alternatively (and more cleanly),
> results = map (ml !) sel
On Fri, Sep 10, 2010 at 10:55 AM, Lorenzo Isella
<[email protected]>wrote:
> Dear All,
> I know this must be a one-liner, but it am banging my head against the
> wall.
> I am trying my hands at list manipulation in Haskell and a lot of useful
> function are making my life easier but I cannot achieve something really
> simple.
> Let us say you have the lists ml and sel
>
> ml=[23,44,55,8,98]
> and
> sel=[1,2] .
>
> Now, I would like simply to get a new list whose entries are the elements
> of ml in position sel. In my case things might be a little more complicated
> because all the elements are Integer and not Int (I noticed that sometimes
> this means I have to resort to generic functions).
> Cheers
>
> Lorenzo
> _______________________________________________
> Beginners mailing list
> [email protected]
> http://www.haskell.org/mailman/listinfo/beginners
>
--
Alex R
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
http://www.haskell.org/pipermail/beginners/attachments/20100910/415f0997/attachment-0001.html
------------------------------
Message: 4
Date: Fri, 10 Sep 2010 17:12:27 +0200
From: Lorenzo Isella <[email protected]>
Subject: Re: [Haskell-beginners] Again on list manipulations
To: Alex Rozenshteyn <[email protected]>
Cc: [email protected]
Message-ID: <[email protected]>
Content-Type: text/plain; charset=UTF-8; format=flowed
On 09/10/2010 05:03 PM, Alex Rozenshteyn wrote:
> alternatively (and more cleanly),
> > results = map (ml !) sel
Thanks Alex. I am left with this problem only: the entries of sel are
(in my code) Integer rather than Int so...I get an error message when
compiling sine "!!" (I believe you left out a "!") expects an Int
instead of an Integer.
Any idea about how to fix this without converting sel from Integer to Int?
Many thanks
Lorenzo
------------------------------
Message: 5
Date: Fri, 10 Sep 2010 17:15:05 +0200
From: Daniel Fischer <[email protected]>
Subject: Re: [Haskell-beginners] Again on list manipulations
To: [email protected]
Cc: Lorenzo Isella <[email protected]>
Message-ID: <[email protected]>
Content-Type: text/plain; charset="iso-8859-1"
On Friday 10 September 2010 16:55:43, Lorenzo Isella wrote:
> Dear All,
> I know this must be a one-liner, but it am banging my head against the
> wall. I am trying my hands at list manipulation in Haskell and a lot of
> useful function are making my life easier but I cannot achieve something
> really simple.
> Let us say you have the lists ml and sel
>
> ml=[23,44,55,8,98]
> and
> sel=[1,2] .
>
> Now, I would like simply to get a new list whose entries are the
> elements of ml in position sel.
Simple but potentially very inefficient:
map (ml !!) sel
resp.
map (genericIndex ml) sel
As long as all elements of sel are small, that's okay, but imagine
sel = [k*10^6 | k <- [1 .. 10]]
or worse,
sel = [10^6 + k | k <- [0, 3 .. 1000]]
If sel is increasing,
selection :: Integral a => [a] -> [b] -> [b]
selection sel = pick distances
where
distances = zipWith (-) sel (0:sel)
pick [] _ = []
pick (d:ds) xs = case genericDrop d xs of
[] -> []
ys@(y:_) -> y : pick ds ys
*Select> selection [1,1,7,9,12,12,20] [0 .. 16]
[1,1,7,9,12,12]
*Select> selection [1,1,7,9,12,12,20] [0 .. 1000]
[1,1,7,9,12,12,20]
If sel is not increasing, an efficient solution is more complex.
> In my case things might be a little more
> complicated because all the elements are Integer and not Int (I noticed
> that sometimes this means I have to resort to generic functions).
> Cheers
>
> Lorenzo
------------------------------
Message: 6
Date: Sat, 11 Sep 2010 01:52:59 +1000
From: Mathew de Detrich <[email protected]>
Subject: Re: [Haskell-beginners] Stack space overflow with foldl'
To: Daniel Fischer <[email protected]>
Cc: [email protected], Ryan Prichard <[email protected]>
Message-ID:
<[email protected]>
Content-Type: text/plain; charset="iso-8859-1"
Shouldn't this be reported as a bug?
On 11/09/2010 12:22 AM, "Daniel Fischer" <[email protected]> wrote:
On Friday 10 September 2010 11:29:56, Ryan Prichard wrote:
> Hi,
>
> I see a stack overflow with thi...
Neither do I, really, but ghc is being too clever for its own good - or not
clever enough.
>
> I looked at the Core output with ghc-core, with or without
> optimizations, and I see a $wlgo r...
Without optimisations, I see a nice tail-recursive lgo inside foldl'2,
pretty much what one would like to see.
With optimisations, you get a specialised worker $wlgo:
Rec {
Main.$wlgo :: [GHC.Types.Int] -> (##)
GblId
[Arity 1
NoCafRefs
Str: DmdType S]
Main.$wlgo =
\ (w_sms :: [GHC.Types.Int]) ->
case case w_sms of _ {
[] -> GHC.Unit.();
: x_adq xs_adr ->
case x_adq of _ { GHC.Types.I# _ ->
case Main.$wlgo xs_adr of _ { (# #) -> GHC.Unit.() }
}
}
of _ { () ->
GHC.Prim.(##)
}
end Rec }
and it's here that ghc shows the wrong amount of cleverness.
What have we? At the types () and [Int], with f = flip seq, the step in lgo
unfolds to
lgo z (x:xs)
~> let z' = f z x in lgo z' xs
~> case f z x of
z'@() -> lgo z' xs
~> case (case x of { I# _ -> z }) of
z'@() -> lgo z' xs
Now flip seq returns its first argument unless its second argument is _|_
and () has only one non-bottom value, so the ()-argument of lgo can be
eliminated (here, the initial ()-argument is known to be (), in general the
wrapper checks for _|_ before entering the worker), giving
wlgo :: [Int] -> ()
wlgo [] = ()
wlgo (x:xs) =
case (case x of { I# _ -> () }) of
() -> wlgo xs
It would be nice if the compiler rewrote the last equation to
wlgo (x:xs) -> case x of { I# _ -> wlgo xs }
, but apparently it can't. Still, that's pretty good code.
Now comes the misplaced cleverness.
Working with unboxed types is better (faster) than working with boxed
types, so let's use unboxed types, giving $wlgo the type
[Int] -> (##)
(unboxed 0-tuple). But it wasn't clever enough to directly return (# #) in
the case of an empty list - that would've allowed the core to be
\ (w_sms :: [GHC.Types.Int]) ->
case w_sms of _ {
[] -> GHC.Prim.(##)
: x_adq xs_adr ->
case x_adq of _ { GHC.Types.I# _ ->
Main.$wlgo xs_adr }
}
and all would've been fine and dandy. So it chose [] -> GHC.Unit.() and
that forced the second branch to also have that type, hence you can't have
a tail call there but have to box the result of the recursive call (only to
unbox it again).
So you get superfluous boxing, unboxing, reboxing in addition to a stack-
eating recursion.
But you've hit a rare case here, if you use foldl'2 (flip seq) at a type
with more than one non-bottom value, the first argument of lgo is not
eliminated and you don't get the boxing-unboxing dance, instead you get a
nice tail-recursion, even if foldl'2 is used only once and not exported.
Why GHC does that for (), beats me.
> I don't see any let expressions in the
> folding code, so I assume no thunks are being created. ...
Perhaps http://www.haskell.org/pipermail/beginners/2010-August/005016.html
?
>
> 2. Instead of using the __GLASGOW_HASKELL__ version of foldl', use the
> other version:
>
> f...
But that needs to pass also the function, which is generally slower than
having it as a static argument.
3. {-# NOINLINE foldl'2 #-}
But that's not so good for performance in general.
Option 1 gives the best Core, but it changes behaviour, with that,
foldl' (\_ _ -> 1) undefined [0] = 1
foldl'2 (\_ _ -> 1) undefined [0] = _|_
To retain the behaviour of foldl',
foldl'2 f z0 xs0 =
case xs0 of
[] -> z0
(x:xs) -> lgo (f z0 x) xs
where
lgo !z [] = z
lgo z (y:ys) = lgo (f z y) ys
>
> My test case is contrived. Originally, I had a program that read
> lines from a file as Data.B...
That's probably a different matter, foldl' evaluates the accumulation
parameter only to weak head normal form, if it's not of simple type, it can
still contain thunks that will overflow the stack when demanded.
>
> I might have fixed my original stack overflow problem. I was applying
> sum to a large list of...
For [Int] and [Integer], sum is specialised, so when compiled with
optimisations, ghc should use a strict version for those types.
> I don't think I have
> any real code anymore that overflows the stack, but I'm uncomfortable
> be...
I don't think the language definition treats that, so technically it's
allowed then. But it shouldn't happen.
>
> Thanks,
> -Ryan
_______________________________________________
Beginners mailing list
Beginne...
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
http://www.haskell.org/pipermail/beginners/attachments/20100910/feb44ff3/attachment.html
------------------------------
_______________________________________________
Beginners mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/beginners
End of Beginners Digest, Vol 27, Issue 23
*****************************************