Re: suboptimal ghc code generation in IO vs equivalent pure code case

2016-05-14 Thread Harendra Kumar
I have stared at the cmm and assembly quite a bit. Indeed there is no trace of a token in cmm and assembly as expected. Here is what is happening. In the IO case the entire original list is evaluated and unfolded on the stack first. During the recursion, the stack will have as many closure

Re: suboptimal ghc code generation in IO vs equivalent pure code case

2016-05-14 Thread David Feuer
The state token is zero-width and should therefore be erased altogether in code generation. On May 14, 2016 4:21 PM, "Tyson Whitehead" wrote: > On 14/05/16 02:31 PM, Harendra Kumar wrote: > >> The difference seems to be entirely due to memory pressure. At list size >> 1000

Re: suboptimal ghc code generation in IO vs equivalent pure code case

2016-05-14 Thread Tyson Whitehead
On 14/05/16 02:31 PM, Harendra Kumar wrote: The difference seems to be entirely due to memory pressure. At list size 1000 both pure version and IO version perform equally. But as the size of the list increases the pure version scales linearly while the IO version degrades exponentially. Here

Re: suboptimal ghc code generation in IO vs equivalent pure code case

2016-05-14 Thread Harendra Kumar
On 15 May 2016 at 01:35, David Feuer wrote: > Well, a few weeks ago Bertram Felgenhauer came up with a version of IO > that acts more like lazy ST. That could be just the thing. He placed it in > the public domain/CC0 and told me I could put it up on Hackage if I want. >

Re: suboptimal ghc code generation in IO vs equivalent pure code case

2016-05-14 Thread David Feuer
Well, a few weeks ago Bertram Felgenhauer came up with a version of IO that acts more like lazy ST. That could be just the thing. He placed it in the public domain/CC0 and told me I could put it up on Hackage if I want. I'll try to do that this week, but no promises. I could forward his email if

Re: suboptimal ghc code generation in IO vs equivalent pure code case

2016-05-14 Thread Harendra Kumar
The difference seems to be entirely due to memory pressure. At list size 1000 both pure version and IO version perform equally. But as the size of the list increases the pure version scales linearly while the IO version degrades exponentially. Here are the execution times per list element in ns as

Re: suboptimal ghc code generation in IO vs equivalent pure code case

2016-05-14 Thread Harendra Kumar
You are right about the way IO code is generated because of the ordering semantics. The IO version builds the list entirely in a recursive fashion before returning it while the pure code builds it lazily. I wrote very simple versions to keep things simpler: Pure version: f [] = [] f (x : xs) = x