I'd like to point out that it's entirely possible to get good performance out of a Handle. The iteratee package has had both FD and Handle-based IO for a while, and I've never observed any serious performance differences between the two. Also, if I may be so bold, Michael's supercharged copy speeds are on par with iteratee's performance using Handles: http://www.tiresiaspress.us/io-benchmarks.html
So while there's definitely something interesting going on here, I think it needs a bit more investigation before suggesting that Handles should be avoided.

For comparison, on my system I get

$ time cp input.dat output.dat

real    0m0.004s
user    0m0.000s
sys     0m0.000s

so the throughput observed on the faster times is entirely reasonable.

John L.

On Fri, Mar 8, 2013 at 4:36 PM, Gregory Collins <g...@gregorycollins.net> wrote:

> +Simon Marlow
> A couple of comments:
>
>    - maybe we shouldn't back the file by a Handle. io-streams does this
>    by default out of the box; I had a posix file interface for unix (guarded
>    by CPP) for a while but decided to ditch it for simplicity. If your results
>    are correct, given how slow going by Handle seems to be I may revisit this,
>    I figured it would be "good enough".
>    - io-streams turns Handle buffering off in withFileAsOutput. So the
>    difference shouldn't be as a result of buffering. Simon: is this an
>    expected result? I presume you did some Handle debugging?
>    - the IO manager should not have any bearing here because file code
>    doesn't actually ever use it (epoll() doesn't work for files)
>    - does the difference persist when the file size gets bigger?
>    - your file descriptor code doesn't handle EINTR properly, although
>    you said you checked that the file copy is being done?
>    - Copying a 1MB file in 1ms gives a throughput of ~1GB/s. The other
>    methods have a more believable ~70MB/s throughput.
>
> G
>
>
> On Fri, Mar 8, 2013 at 7:30 AM, Michael Snoyman <mich...@snoyman.com> wrote:
>
>> Hi all,
>>
>> I'm turning to the community for some help understanding some benchmark
>> results[1]. I was curious to see how the new io-streams would work with
>> conduit, as it looks like a far saner low-level approach than Handles. In
>> fact, the API is so simple that the entire wrapper is just a few lines of
>> code[2].
>>
>> I then added in some basic file copy benchmarks, comparing conduit+Handle
>> (with ResourceT or bracket), conduit+io-streams, straight io-streams, and
>> lazy I/O. All approaches fell into the same ballpark, with conduit+bracket
>> and conduit+io-streams taking a slight lead. (I haven't analyzed that
>> enough to know if it means anything, however.)
>>
>> Then I decided to pull up the NoHandle code I wrote a while ago for
>> conduit. This code was written initially for Windows only, to work around
>> the fact that System.IO.openFile does some file locking. To avoid using
>> Handles, I wrote a simple FFI wrapper exposing open, read, and close system
>> calls, ported it to POSIX, and hid it behind a Cabal flag. Out of
>> curiosity, I decided to expose it and include it in the benchmark.
>>
>> The results are extreme. I've confirmed multiple times that the copy
>> algorithm is in fact copying the file, so I don't think the test itself is
>> cheating somehow. But I don't know how to explain the massive gap. I've run
>> this on two different systems. The results you see linked are from my local
>> machine. On an EC2 instance, the gap was a bit smaller, but the NoHandle
>> code was still 75% faster than the others.
>>
>> My initial guess is that I'm not properly tying into the IO manager, but
>> I wanted to see if the community had any thoughts. The relevant pieces of
>> code are [3][4][5].
>>
>> Michael
>>
>> [1] http://static.snoyman.com/streams.html
>> [2] https://github.com/snoyberg/conduit/blob/streams/io-streams-conduit/Data/Conduit/Streams.hs
>> [3] https://github.com/snoyberg/conduit/blob/streams/conduit/System/PosixFile.hsc
>> [4] https://github.com/snoyberg/conduit/blob/streams/conduit/Data/Conduit/Binary.hs#L54
>> [5] https://github.com/snoyberg/conduit/blob/streams/conduit/Data/Conduit/Binary.hs#L167
>>
>> _______________________________________________
>> Haskell-Cafe mailing list
>> Haskell-Cafe@haskell.org
>> http://www.haskell.org/mailman/listinfo/haskell-cafe
>>
>
> --
> Gregory Collins <g...@gregorycollins.net>
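
On the EINTR point in Gregory's list quoted above, here is a rough sketch of what a retrying descriptor-level copy loop might look like. It is not the code from Michael's PosixFile.hsc; the names readRetry, writeAll and copyFd are made up for illustration, the 64 KiB buffer size is an arbitrary assumption, and it copies stdin to stdout so that no open() binding is needed. It relies only on throwErrnoIfMinus1Retry from base, which re-runs the action when errno is EINTR.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
module Main (main) where

import Control.Monad (when)
import Data.Word (Word8)
import Foreign.C.Error (throwErrnoIfMinus1Retry)
import Foreign.C.Types (CInt (..), CSize (..))
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Ptr (Ptr, plusPtr)
import System.Posix.Types (CSsize (..))

-- Raw POSIX read(2) and write(2), bypassing Handle entirely.
foreign import ccall safe "unistd.h read"
  c_read :: CInt -> Ptr Word8 -> CSize -> IO CSsize

foreign import ccall safe "unistd.h write"
  c_write :: CInt -> Ptr Word8 -> CSize -> IO CSsize

-- | read(2), retried if a signal interrupts it (EINTR).
-- Hypothetical helper name; returns the number of bytes read, 0 at EOF.
readRetry :: CInt -> Ptr Word8 -> Int -> IO Int
readRetry fd buf n =
  fromIntegral <$> throwErrnoIfMinus1Retry "read" (c_read fd buf (fromIntegral n))

-- | Write all n bytes, retrying on EINTR and continuing after short writes.
writeAll :: CInt -> Ptr Word8 -> Int -> IO ()
writeAll fd buf n = when (n > 0) $ do
  w <- fromIntegral <$> throwErrnoIfMinus1Retry "write" (c_write fd buf (fromIntegral n))
  writeAll fd (buf `plusPtr` w) (n - w)

-- | Copy everything from one descriptor to another through a fixed buffer.
copyFd :: CInt -> CInt -> IO ()
copyFd inFd outFd = allocaBytes bufSize loop
  where
    bufSize = 65536 -- assumed buffer size, not taken from the benchmark
    loop buf = do
      n <- readRetry inFd buf bufSize
      when (n > 0) $ writeAll outFd buf n >> loop buf

main :: IO ()
main = copyFd 0 1 -- stdin to stdout
```

Compiled with GHC, something like `./copy < input.dat > output.dat` should exercise it. Whether the missing retry explains any part of the benchmark gap is a separate question that this sketch does not answer.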