Re: [Haskell-cafe] Need some advice around lazy IO

2013-03-24 Thread C K Kashyap
Thanks for the pointer, Mukesh. I'll go over the blog.

Switching the XML parser to another one from Hackage (xml) helped, but not
fully. I think I would need to change to ByteString. For now, I have split
the program into smaller programs, and that seems to work.

Regards,
Kashyap


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Need some advice around lazy IO

2013-03-23 Thread mukesh tiwari
Hi Kashyap,
I am not sure if this is the solution to your problem, but try using ByteString
rather than String in

parseXML' :: String -> XMLAST
parseXML' str =
  f ast where
  ast = parse (spaces >> xmlParser) "" str
  f (Right x) = x
  f (Left x) = CouldNotParse


Also see this post [1], "My Space is Leaking".

Regards,
Mukesh Tiwari

[1] http://www.mega-nerd.com/erikd/Blog/
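
As a rough illustration of the ByteString suggestion (a minimal sketch, not code from the repro project): a strict ByteString read pulls in the whole file and closes the handle before returning, and forcing the per-file result stops the raw contents from being retained. The names summarise and summariseAll are hypothetical, and counting lines merely stands in for the real parsing step.

```haskell
import Control.Exception (evaluate)
import qualified Data.ByteString.Char8 as B

-- Hypothetical stand-in for real parsing: count the lines in a file's contents.
summarise :: B.ByteString -> Int
summarise = length . B.lines

-- Read each file strictly, then force its summary before moving on, so that
-- only the small Int result stays alive rather than the file's bytes.
summariseAll :: [FilePath] -> IO [Int]
summariseAll = mapM $ \path -> do
  bytes <- B.readFile path      -- strict: reads everything, then closes the handle
  evaluate (summarise bytes)    -- force the Int now instead of keeping a thunk
```

Without the evaluate, each list element would be a thunk that still refers to the full contents of its file, which is one way memory can keep growing across thousands of files.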




Re: [Haskell-cafe] Need some advice around lazy IO

2013-03-22 Thread C K Kashyap
Hi folks,

I've run into more issues with my report generation tool, and I'd really
appreciate some help.

I've created a repro project on GitHub to demonstrate the problem.
git://github.com/ckkashyap/haskell-perf-repro.git

There is a template XML file that needs to be replicated several times
(3000 or so) under the data directory, and then the driver needs to be run.
The memory used by the driver keeps growing until it runs out of memory.

Also, I'd appreciate some tips on how to go about debugging this situation.
I am on the Windows platform.


Regards,
Kashyap




Re: [Haskell-cafe] Need some advice around lazy IO

2013-03-22 Thread C K Kashyap
I got some profiling done and got this pdf generated. I see unhealthy
growths in my XML parser.





Re: [Haskell-cafe] Need some advice around lazy IO

2013-03-22 Thread C K Kashyap
Oops...I sent out the earlier message accidentally.

I got some profiling done and got this pdf generated. I see unhealthy
growths in my XML parser.
https://github.com/ckkashyap/haskell-perf-repro/blob/master/RSXP.hs
I must not be using Parsec efficiently.

Regards,
Kashyap




driver.pdf
Description: Adobe PDF document


Re: [Haskell-cafe] Need some advice around lazy IO

2013-03-19 Thread Konstantin Litvinenko

On 03/19/2013 07:12 AM, Edward Kmett wrote:

Konstantin,

Please allow me to elaborate on Dan's point -- or at least the point
that I believe that Dan is making.

Using,

let bug = Control.DeepSeq.rnf str `seq` fileContents2Bug str


or ($!!) will create a value that *when forced* causes the rnf to occur.

As you don't look at bug until much later, this causes the same problem as
before!



Yes. You (and Dan) are totally right. 'let' just binds the expression; it does
not evaluate it. Dan's evaluate trick forces rnf to run before hClose. As I
said, it's a tricky part, especially for a newbie like me :)






Re: [Haskell-cafe] Need some advice around lazy IO

2013-03-19 Thread Kim-Ee Yeoh
On Tue, Mar 19, 2013 at 2:01 PM, Konstantin Litvinenko
to.darkan...@gmail.com wrote:
 Yes. You (and Dan) are totally right. 'let' just binds the expression; it does
 not evaluate it. Dan's evaluate trick forces rnf to run before hClose. As I
 said, it's a tricky part, especially for a newbie like me :)

To place this in perspective, one only needs to descend one or two
more layers before the semantics starts confusing even experts.

Whereas the difference between seq and evaluate shouldn't be too hard
to grasp, that between evaluate and (return $!) is considerably more
subtle, as Edward Yang notified us 10 days ago. See the thread titled
"To seq or not to seq".

-- Kim-Ee



Re: [Haskell-cafe] Need some advice around lazy IO

2013-03-18 Thread Konstantin Litvinenko

On 03/17/2013 07:08 AM, C K Kashyap wrote:

I am working on an automation that periodically fetches bug data from
our bug tracking system and creates static HTML reports. Things worked
fine when the bugs were in the order of 200 or so. Now I am trying to
run it against 3000 bugs and suddenly I see things like - too  many open
handles, out of memory etc ...

Here's the code snippet - http://hpaste.org/84197

It's a small snippet and I've put in the comments stating how I run into
out of file handles or simply file not getting read due to lazy IO.

I realize that putting ($!) using a trial/error approach is going to be
futile. I'd appreciate some pointers into the tools I could use to get
some idea of which expressions are building up huge thunks.


Your problem is in

let bug = ($!) fileContents2Bug str

($!) evaluates its argument only to WHNF, and you need NF. The line above
forces only the first character of the file, not the whole content. To fully
evaluate 'str' you need something like


let bug = Control.DeepSeq.rnf str `seq` fileContents2Bug str
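
To make the WHNF versus NF distinction concrete, here is a minimal sketch. It assumes only base and the deepseq package; lazyTail, completes, whnfOnly and fullNf are names made up for the illustration. The string's first cons cell is fine, but touching its tail raises an error, so the two helpers show how far each kind of evaluation reaches.

```haskell
import Control.DeepSeq (force)
import Control.Exception (SomeException, evaluate, try)

-- A String whose first cons cell is fine but whose tail fails when touched.
lazyTail :: String
lazyTail = 'a' : error "tail was forced"

-- Run an action and report whether it completed without an exception.
completes :: IO a -> IO Bool
completes act = either failed (const True) <$> try act
  where
    failed :: SomeException -> Bool
    failed _ = False

-- WHNF: only the outermost constructor (the first cons cell) is evaluated.
whnfOnly :: IO Bool
whnfOnly = completes (evaluate lazyTail)

-- NF: 'force' walks the whole structure, so the failing tail is reached.
fullNf :: IO Bool
fullNf = completes (evaluate (force lazyTail))
```

Here whnfOnly yields True because the tail is never inspected, while fullNf yields False because force reaches it.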







Re: [Haskell-cafe] Need some advice around lazy IO

2013-03-18 Thread Ivan Lazar Miljenovic
On 18 March 2013 21:01, Konstantin Litvinenko to.darkan...@gmail.com wrote:
 On 03/17/2013 07:08 AM, C K Kashyap wrote:

 I am working on an automation that periodically fetches bug data from
 our bug tracking system and creates static HTML reports. Things worked
 fine when the bugs were in the order of 200 or so. Now I am trying to
 run it against 3000 bugs and suddenly I see things like - too  many open
 handles, out of memory etc ...

 Here's the code snippet - http://hpaste.org/84197

 It's a small snippet and I've put in the comments stating how I run into
 out of file handles or simply file not getting read due to lazy IO.

 I realize that putting ($!) using a trial/error approach is going to be
 futile. I'd appreciate some pointers into the tools I could use to get
 some idea of which expressions are building up huge thunks.


 You problem is in

 let bug = ($!) fileContents2Bug str

 ($!) evaluate only WHNF and you need NF. Above just evaluate to first char
 in a file, not to all content. To fully evaluate 'str' you need something
 like

 let bug = Control.DeepSeq.rnf str `seq` fileContents2Bug str

Or use $!! from Control.DeepSeq.










-- 
Ivan Lazar Miljenovic
ivan.miljeno...@gmail.com
http://IvanMiljenovic.wordpress.com



Re: [Haskell-cafe] Need some advice around lazy IO

2013-03-18 Thread C K Kashyap
Thanks Konstantin ... I'll try that out too...



Regards,
Kashyap




Re: [Haskell-cafe] Need some advice around lazy IO

2013-03-18 Thread Dan Doel
Do note that deepSeq alone won't (I think) change anything in your
current code. bug will deepSeq the file contents. And the cons will
seq bug. But nothing is evaluating the cons. And further, the cons
isn't seqing the tail, so none of that will collapse, either. So the
file descriptors will still all be opened at once.

Probably the best solution if you choose to go this way is:

bug <- evaluate (fileContents2Bug $!! str)

which ties the evaluation of the file contents into the IO execution.
At that point, deepSeqing the file is probably unnecessary, though,
because evaluating the bug will likely allow the file contents to be
collected.
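
A minimal sketch of how that line might sit inside a complete read, assuming the file is opened with withFile and that the parsed result has an NFData instance. loadStrictly and its fromContents parameter are hypothetical names; fromContents plays the role of fileContents2Bug from the snippet.

```haskell
import Control.DeepSeq (NFData, force)
import Control.Exception (evaluate)
import System.IO (IOMode (ReadMode), hGetContents, withFile)

-- Read a file lazily, apply a pure function to its contents, and fully
-- evaluate the result while the handle is still open. Only then does
-- withFile close the handle, so nothing is left waiting on lazy input.
loadStrictly :: NFData b => (String -> b) -> FilePath -> IO b
loadStrictly fromContents path =
  withFile path ReadMode $ \h -> do
    str <- hGetContents h
    evaluate (force (fromContents str))
```

Because evaluate is an IO action, the forcing happens at that point in the sequence of actions, before the handle is closed, rather than whenever the result is first inspected.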



Re: [Haskell-cafe] Need some advice around lazy IO

2013-03-18 Thread Konstantin Litvinenko

On 03/18/2013 06:06 PM, Dan Doel wrote:

Do note that deepSeq alone won't (I think) change anything in your
current code. bug will deepSeq the file contents.


rnf fully evaluates 'bug' by reading the entire file content. Later, hClose
will close the handle and we are done. Not reading all of the content leaves
the handle semi-closed, and it leaks in that case. The handle stays open until
the lazy list from hGetContents hits the end.


And the cons will seq bug. But nothing is evaluating the cons. And further, the cons
isn't seqing the tail, so none of that will collapse, either. So the
file descriptors will still all be opened at once.

Probably the best solution if you choose to go this way is:

 bug <- evaluate (fileContents2Bug $!! str)

which ties the evaluation of the file contents into the IO execution.
At that point, deepSeqing the file is probably unnecessary, though,
because evaluating the bug will likely allow the file contents to be
collected.


evaluate does the same as $!: it evaluates its argument to WHNF. That won't
help in any way. Executing in the IO monad doesn't imply strictness. That's
why mixing lazy hGetContents with strict hOpen/hClose is so tricky.






Re: [Haskell-cafe] Need some advice around lazy IO

2013-03-18 Thread Edward Kmett
Konstantin,

Please allow me to elaborate on Dan's point -- or at least the point that I
believe that Dan is making.

Using,

let bug = Control.DeepSeq.rnf str `seq` fileContents2Bug str


or ($!!) will create a value that *when forced* causes the rnf to occur.

As you don't look at bug until much later, this causes the same problem as
before!

His addition of evaluate forces the rnf to happen before proceeding.

On a more ad hoc basis you might say

let !bug = fileContents2Bug $!! str

but without the bang pattern or the evaluate (which is arguably strictly
better, er, no pun intended, from a semantics perspective), nothing has
happened yet until someone inspects bug.

With the code as structured this doesn't happen until it is too late.

-Edward
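
The timing difference between the three forms can be checked with a small experiment. This is a sketch under stated assumptions: work is a made-up stand-in for the computation that must finish before hClose, and skipsWork simply reports whether an action completed without ever forcing it.

```haskell
{-# LANGUAGE BangPatterns #-}
import Control.Exception (SomeException, evaluate, try)

-- Stand-in for the computation that has to happen before hClose.
work :: Int
work = error "work was forced"

-- Report whether an IO action ran to completion without forcing 'work'.
skipsWork :: IO () -> IO Bool
skipsWork act = either forced (const True) <$> try act
  where
    forced :: SomeException -> Bool
    forced _ = False

plainLet :: IO ()
plainLet = do
  let _bug = work        -- only binds a thunk; nothing is evaluated
  return ()

bangLet :: IO ()
bangLet = do
  let !_bug = work       -- forced as soon as this point is reached
  return ()

viaEvaluate :: IO ()
viaEvaluate = do
  _ <- evaluate work     -- forced as part of running the IO action
  return ()
```

Only plainLet completes without touching work; both the bang pattern and evaluate force it on the spot, which is the behaviour needed when a handle is about to be closed.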



Re: [Haskell-cafe] Need some advice around lazy IO

2013-03-17 Thread Carlo Hamalainen
On Sun, Mar 17, 2013 at 3:08 PM, C K Kashyap ckkash...@gmail.com wrote:

 It's a small snippet and I've put in the comments stating how I run into
 out of file handles or simply file not getting read due to lazy IO.

 I realize that putting ($!) using a trial/error approach is going to be
 futile. I'd appreciate some pointers into the tools I could use to get some
 idea of which expressions are building up huge thunks.


Have you tried System.IO.Strict's readFile? I had similar problems (too
many file handles) and fixed it with

import qualified System.IO.Strict as S

and then using S.readFile instead of the standard prelude's readFile.

This is where I used the strict IO readFile in my toy project:
https://github.com/carlohamalainen/checker/blob/master/Checker.hs
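
System.IO.Strict comes from a package on Hackage rather than from base. For anyone who would rather avoid the extra dependency, a similar effect can be approximated with base alone. The following is a minimal sketch, not the library's implementation, and readFileStrict is a hypothetical name.

```haskell
import Control.Exception (evaluate)
import System.IO (IOMode (ReadMode), hGetContents, withFile)

-- Read a whole file into memory and close its handle before returning.
readFileStrict :: FilePath -> IO String
readFileStrict path =
  withFile path ReadMode $ \h -> do
    contents <- hGetContents h
    _ <- evaluate (length contents)  -- walk the entire list, reading to EOF
    return contents
```

Because the handle is closed when readFileStrict returns, the same file can be rewritten immediately afterwards, and mapping this over thousands of paths holds at most one handle open at a time (at the cost of keeping each String in memory for as long as it is referenced).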

-- 
Carlo Hamalainen
http://carlo-hamalainen.net


Re: [Haskell-cafe] Need some advice around lazy IO

2013-03-17 Thread Dan Doel
One thing that typically isn't mentioned in these situations is that
you can add more laziness. I'm unsure if it would work from just your
snippet, but it might.

The core problem is that something like:

mapM readFile names

will open all the files at once. Applying any processing to the file
contents is irrelevant unless the results of that processing are
evaluated sufficiently to allow the file to be closed.

Now, most people will tell you that this means lazy I/O is evil, and
you should make it all strict. But, consider an analogous situation
where instead of opening a file handle, we do something that allocates
a lot of memory, and can only free it after processing. We'd run out
of memory allocating 3,000 * X, but X alone is fine. Then people
usually suggest delaying the allocation until you need it, i.e. lazy
evaluation.

Unfortunately, there's no combinator for this in the standard
libraries, but you can write one:

import System.IO.Unsafe (unsafeInterleaveIO)

mapMI :: (a -> IO b) -> [a] -> IO [b]
mapMI _ [] = return []
-- You can play with this case a bit. This will open a file for the head
-- of the list, and then when each subsequent cons cell is inspected. You
-- could probably interleave 'f x' as well.
mapMI f (x:xs) = do y <- f x ; ys <- unsafeInterleaveIO (mapMI f xs) ; return (y:ys)

Now, mapMI readFile only opens the handle when you match on the list,
so if you process the list incrementally, it will open the file
handles one-by-one.
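
A small self-contained sketch of that on-demand behaviour, restating mapMI so the block compiles on its own. actionsRun is a hypothetical helper, and a counter in an IORef stands in for opening a file.

```haskell
import Control.Exception (evaluate)
import Data.IORef (modifyIORef', newIORef, readIORef)
import System.IO.Unsafe (unsafeInterleaveIO)

-- Like mapM, but the action for each later element runs only when the
-- corresponding cons cell of the result list is demanded.
mapMI :: (a -> IO b) -> [a] -> IO [b]
mapMI _ []       = return []
mapMI f (x : xs) = do
  y  <- f x
  ys <- unsafeInterleaveIO (mapMI f xs)
  return (y : ys)

-- Count how many times the action has run after consuming the first
-- 'wanted' results out of 'total' inputs.
actionsRun :: Int -> Int -> IO Int
actionsRun total wanted = do
  counter <- newIORef (0 :: Int)
  results <- mapMI (\x -> modifyIORef' counter (+ 1) >> return x) [1 .. total]
  _ <- evaluate (length (take wanted results))
  readIORef counter
```

With 3000 inputs, consuming only the first two results runs the action twice rather than 3000 times, which is why the file handles get opened one by one as the list is processed.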

As an aside, you should never use hClose when doing lazy I/O. That's
kind of like solving the above "I've allocated too much memory"
problem with "just overwrite some expensive stuff with some other
cheap stuff to free up space."

-- Dan




Re: [Haskell-cafe] Need some advice around lazy IO

2013-03-17 Thread Petr Pudlák
Hi Kashyap,

you could also use iteratees or conduits for a task like that. The beauty
of such libraries is that they can ensure that a resource is always
properly disposed of. See this simple example:
https://gist.github.com/anonymous/5183107
It prints the first line of each file given as an argument. After each line
is printed, the `fileConduit` pipe ensures that the handle is closed. It
also makes the program nicely composable.

Best regards,
Petr


import Control.Monad
import Control.Monad.Trans.Class
import Control.Monad.IO.Class
import Data.Conduit
import Data.Conduit.List
import System.Environment
import System.IO

{- | Accept file paths on input, output opened file handle, and ensure that the
 - handle is always closed after its downstream pipe finishes whatever
 - work on it. -}
fileConduit :: MonadResource m => IOMode -> Conduit FilePath m Handle
fileConduit mode = awaitForever process
  where
    process file = bracketP (openFile file mode) closeWithMsg yield
    closeWithMsg h = do
        putStrLn "Closing file"
        hClose h

{- | Print the first line from each handle on input. Don't care about
 - the handle. -}
firstLine :: MonadIO m => Sink Handle m ()
firstLine = awaitForever (liftIO . (hGetLine >=> putStrLn))

main = do
    args <- getArgs
    runResourceT $ sourceList args =$= fileConduit ReadMode $$ firstLine






Re: [Haskell-cafe] Need some advice around lazy IO

2013-03-17 Thread C K Kashyap
Thanks everyone,

Dan, mapMI worked for me ...

Regards,
Kashyap




[Haskell-cafe] Need some advice around lazy IO

2013-03-16 Thread C K Kashyap
Hi,

I am working on an automation that periodically fetches bug data from our
bug tracking system and creates static HTML reports. Things worked fine
when the bugs were on the order of 200 or so. Now I am trying to run it
against 3000 bugs and suddenly I see things like too many open handles,
out of memory, etc.

Here's the code snippet - http://hpaste.org/84197

It's a small snippet, and I've put in comments stating how I run out of
file handles, or how a file simply does not get read, due to lazy IO.

I realize that sprinkling ($!) around by trial and error is going to be
futile. I'd appreciate some pointers to tools I could use to get some
idea of which expressions are building up huge thunks.


Regards,
Kashyap