Re: [Haskell-cafe] File I/O benchmark help (conduit, io-streams and Handle)

2013-03-09 Thread John Lato
On Fri, Mar 8, 2013 at 6:36 PM, Simon Marlow marlo...@gmail.com wrote:
> 1GB/s for copying a file is reasonable - it's around half the memory bandwidth, so copying the data twice would give that result (assuming no actual I/O is taking place, which is what you want because actual I/O will swamp

Re: [Haskell-cafe] File I/O benchmark help (conduit, io-streams and Handle)

2013-03-09 Thread Michael Snoyman
Just to clarify: the problem was in fact with my code. I was not passing O_TRUNC to the open system call. Gregory's C code showed me the problem. Once I add in that option, all the different benchmarks complete in roughly the same amount of time. So given that our Haskell implementations based on
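For readers unfamiliar with the flag, here is a minimal C sketch of what O_TRUNC changes when the output file already exists. The helper name write_and_measure, the file path and the 0644 mode are made up for illustration; this is neither Michael's benchmark code nor Gregory's gist, and it only shows the flag's effect on the resulting file size, not why it moved the timings.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write `data` to `path`, optionally passing O_TRUNC.
   Returns the size of the file after the write, or -1 on error. */
long write_and_measure(const char *path, const char *data, int truncate)
{
    assert(path != NULL && data != NULL);

    int flags = O_WRONLY | O_CREAT;
    if (truncate)
        flags |= O_TRUNC;            /* discard existing contents on open */

    int fd = open(path, flags, 0644);
    if (fd < 0)
        return -1;

    /* write() may be partial, so loop until everything is written.
       Any failure (including EINTR) is treated as an error in this sketch. */
    size_t len = strlen(data);
    size_t off = 0;
    while (off < len) {
        ssize_t n = write(fd, data + off, len - off);
        if (n < 0) {
            close(fd);
            return -1;
        }
        off += (size_t)n;
    }

    struct stat st;
    if (fstat(fd, &st) != 0) {
        close(fd);
        return -1;
    }
    close(fd);
    return (long)st.st_size;
}
```

Writing 10 bytes with truncation and then 3 bytes without it leaves a 10-byte file, because the old tail is still there; repeating the 3-byte write with truncation leaves a 3-byte file.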

Re: [Haskell-cafe] File I/O benchmark help (conduit, io-streams and Handle)

2013-03-08 Thread Gregory Collins
On Fri, Mar 8, 2013 at 9:36 AM, Gregory Collins g...@gregorycollins.net wrote:
> I presume you did some Handle debugging?
...and here I mean benchmarking of course.
-- Gregory Collins g...@gregorycollins.net

Re: [Haskell-cafe] File I/O benchmark help (conduit, io-streams and Handle)

2013-03-08 Thread Gregory Collins
+Simon Marlow
A couple of comments:
- maybe we shouldn't back the file by a Handle. io-streams does this by default out of the box; I had a posix file interface for unix (guarded by CPP) for a while but decided to ditch it for simplicity. If your results are correct, given how slow

Re: [Haskell-cafe] File I/O benchmark help (conduit, io-streams and Handle)

2013-03-08 Thread John Lato
I'd like to point out that it's entirely possible to get good performance out of a handle. The iteratee package has had both FD and Handle-based IO for a while, and I've never observed any serious performance differences between the two. Also, if I may be so bold, Michael's supercharged copy

Re: [Haskell-cafe] File I/O benchmark help (conduit, io-streams and Handle)

2013-03-08 Thread Gregory Collins
On Fri, Mar 8, 2013 at 9:48 AM, John Lato jwl...@gmail.com wrote:
> For comparison, on my system I get
> $ time cp input.dat output.dat
> real 0m0.004s
> user 0m0.000s
> sys 0m0.000s
Does your workstation have an SSD? Michael's using a spinning disk.
-- Gregory Collins g...@gregorycollins.net

Re: [Haskell-cafe] File I/O benchmark help (conduit, io-streams and Handle)

2013-03-08 Thread Alexander Kjeldaas
On Fri, Mar 8, 2013 at 9:53 AM, Gregory Collins g...@gregorycollins.net wrote:
> On Fri, Mar 8, 2013 at 9:48 AM, John Lato jwl...@gmail.com wrote:
>> For comparison, on my system I get
>> $ time cp input.dat output.dat
>> real 0m0.004s
>> user 0m0.000s
>> sys 0m0.000s
> Does your workstation have an SSD?

Re: [Haskell-cafe] File I/O benchmark help (conduit, io-streams and Handle)

2013-03-08 Thread Gregory Collins
Something must be wrong with the conduit NoHandle code. I increased the filesize to 60MB and implemented the copy loop in pure C; the code and results are here: https://gist.github.com/gregorycollins/5115491 Everything but the conduit NoHandle code runs in roughly 600-620ms, including the pure C
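As a point of reference, a copy loop of this general kind can be sketched in C as below. This is an illustration written for this listing, not the code from the gist: the function name copy_file, the 64 KiB buffer size and the 0644 mode are assumptions, and the real benchmark may differ in all of them.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

enum { COPY_BUF_SIZE = 64 * 1024 };  /* assumed buffer size, not from the thread */

/* Hypothetical helper: copy `src` to `dst` using plain read()/write().
   Returns the number of bytes copied, or -1 on error. */
long copy_file(const char *src, const char *dst)
{
    assert(src != NULL && dst != NULL);

    int in = open(src, O_RDONLY);
    if (in < 0)
        return -1;

    /* O_TRUNC so an existing, longer destination does not keep its old tail. */
    int out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (out < 0) {
        close(in);
        return -1;
    }

    char buf[COPY_BUF_SIZE];
    long total = 0;
    int ok = 1;

    while (ok) {
        ssize_t n = read(in, buf, sizeof buf);
        if (n == 0)
            break;                      /* end of file */
        if (n < 0) {
            if (errno == EINTR)
                continue;               /* interrupted, retry the read */
            ok = 0;
            break;
        }

        /* write() may accept fewer bytes than requested, so drain the buffer. */
        ssize_t off = 0;
        while (off < n) {
            ssize_t w = write(out, buf + off, (size_t)(n - off));
            if (w < 0) {
                if (errno == EINTR)
                    continue;           /* interrupted, retry the write */
                ok = 0;
                break;
            }
            off += w;
        }
        if (ok)
            total += n;
    }

    close(in);
    if (close(out) != 0)
        ok = 0;
    return ok ? total : -1;
}
```

Calling copy_file("input.dat", "output.dat") would mirror the cp invocation quoted elsewhere in the thread. The single fixed buffer means each chunk is read once from the source and written once to the destination.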

Re: [Haskell-cafe] File I/O benchmark help (conduit, io-streams and Handle)

2013-03-08 Thread Michael Snoyman
That demonstrated the issue: I'd forgotten to pass O_TRUNC to the open system call. Adding that back makes the numbers much more comparable. Thanks for the input everyone, and Gregory for finding the actual problem (as well as pointing out a few other improvements). On Fri, Mar 8, 2013 at 12:13

Re: [Haskell-cafe] File I/O benchmark help (conduit, io-streams and Handle)

2013-03-08 Thread Simon Marlow
1GB/s for copying a file is reasonable - it's around half the memory bandwidth, so copying the data twice would give that result (assuming no actual I/O is taking place, which is what you want because actual I/O will swamp any differences at the software level). The Handle overhead should be

[Haskell-cafe] File I/O benchmark help (conduit, io-streams and Handle)

2013-03-07 Thread Michael Snoyman
Hi all, I'm turning to the community for some help understanding some benchmark results[1]. I was curious to see how the new io-streams would work with conduit, as it looks like a far saner low-level approach than Handles. In fact, the API is so simple that the entire wrapper is just a few lines

Re: [Haskell-cafe] File I/O benchmark help (conduit, io-streams and Handle)

2013-03-07 Thread Michael Snoyman
One clarification: it seems that sourceFile and sourceFileNoHandle have virtually no difference in speed. The gap comes exclusively from sinkFile vs sinkFileNoHandle. This makes me think that it might be a buffer copy that's causing the slowdown, in which case the benchmark may in fact be

Re: [Haskell-cafe] File I/O benchmark help (conduit, io-streams and Handle)

2013-03-07 Thread John Lato
I would have expected sourceFileNoHandle to make the most difference, since that's one location (write) where you've obviously removed a copy. Does sourceFileNoHandle allocate less? Incidentally, I've recently been making similar changes to IO code (removing buffer copies) and getting similar