+Simon Marlow

A couple of comments:

- maybe we shouldn't back the file by a Handle. io-streams does this by
  default out of the box; I had a posix file interface for unix (guarded by
  CPP) for a while but decided to ditch it for simplicity. If your results
  are correct, given how slow going by Handle seems to be I may revisit
  this, I figured it would be "good enough".

- io-streams turns Handle buffering off in withFileAsOutput. So the
  difference shouldn't be as a result of buffering. Simon: is this an
  expected result? I presume you did some Handle debugging?

- the IO manager should not have any bearing here because file code doesn't
  actually ever use it (epoll() doesn't work for files)

- does the difference persist when the file size gets bigger?

- your file descriptor code doesn't handle EINTR properly, although you
  said you checked that the file copy is being done?

- Copying a 1MB file in 1ms gives a throughput of ~1GB/s. The other methods
  have a more believable ~70MB/s throughput.
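On the EINTR point, here is a minimal sketch of what retrying could look like, assuming a raw FFI binding to read(2) along the lines of the NoHandle code. The names c_read, readRetry and countBytes are made up for illustration and are not taken from conduit or io-streams; the only library helper relied on is throwErrnoIfMinus1Retry from Foreign.C.Error, which re-runs the action whenever it returns -1 with errno set to EINTR.

```haskell
module Main (main) where

import Data.Word (Word8)
import Foreign.C.Error (throwErrnoIfMinus1Retry)
import Foreign.C.Types (CInt (..), CSize (..))
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Ptr (Ptr)
import System.Posix.Types (CSsize (..))

-- Raw binding to POSIX read(2). Hypothetical name, for illustration only.
foreign import ccall "unistd.h read"
  c_read :: CInt -> Ptr Word8 -> CSize -> IO CSsize

-- | read(2) that is retried when a signal interrupts it (errno == EINTR)
-- and that throws an IOError for any other failure.
readRetry :: CInt -> Ptr Word8 -> CSize -> IO CSsize
readRetry fd buf len = throwErrnoIfMinus1Retry "read" (c_read fd buf len)

-- | Count the bytes readable from a descriptor until end of file.
countBytes :: CInt -> IO Integer
countBytes fd = allocaBytes bufSize (go 0)
  where
    bufSize :: Int
    bufSize = 4096

    go acc buf = do
      n <- readRetry fd buf (fromIntegral bufSize)
      if n == 0
        then pure acc                       -- 0 means end of file
        else go (acc + fromIntegral n) buf  -- short reads are fine, keep going

-- Demo: count the bytes arriving on stdin (file descriptor 0).
main :: IO ()
main = countBytes 0 >>= print
```

Running it as `./count < somefile` should print the size of the file in bytes. A write loop would need the same treatment, plus handling of short writes.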
G

On Fri, Mar 8, 2013 at 7:30 AM, Michael Snoyman <[email protected]> wrote:

> Hi all,
>
> I'm turning to the community for some help understanding some benchmark
> results[1]. I was curious to see how the new io-streams would work with
> conduit, as it looks like a far saner low-level approach than Handles. In
> fact, the API is so simple that the entire wrapper is just a few lines of
> code[2].
>
> I then added in some basic file copy benchmarks, comparing conduit+Handle
> (with ResourceT or bracket), conduit+io-streams, straight io-streams, and
> lazy I/O. All approaches fell into the same ballpark, with conduit+bracket
> and conduit+io-streams taking a slight lead. (I haven't analyzed that
> enough to know if it means anything, however.)
>
> Then I decided to pull up the NoHandle code I wrote a while ago for
> conduit. This code was written initially for Windows only, to work around
> the fact that System.IO.openFile does some file locking. To avoid using
> Handles, I wrote a simple FFI wrapper exposing open, read, and close system
> calls, ported it to POSIX, and hid it behind a Cabal flag. Out of
> curiosity, I decided to expose it and include it in the benchmark.
>
> The results are extreme. I've confirmed multiple times that the copy
> algorithm is in fact copying the file, so I don't think the test itself is
> cheating somehow. But I don't know how to explain the massive gap. I've run
> this on two different systems. The results you see linked are from my local
> machine. On an EC2 instance, the gap was a bit smaller, but the NoHandle
> code was still 75% faster than the others.
>
> My initial guess is that I'm not properly tying into the IO manager, but I
> wanted to see if the community had any thoughts. The relevant pieces of
> code are [3][4][5].
>
> Michael
>
> [1] http://static.snoyman.com/streams.html
> [2] https://github.com/snoyberg/conduit/blob/streams/io-streams-conduit/Data/Conduit/Streams.hs
> [3] https://github.com/snoyberg/conduit/blob/streams/conduit/System/PosixFile.hsc
> [4] https://github.com/snoyberg/conduit/blob/streams/conduit/Data/Conduit/Binary.hs#L54
> [5] https://github.com/snoyberg/conduit/blob/streams/conduit/Data/Conduit/Binary.hs#L167
>
> _______________________________________________
> Haskell-Cafe mailing list
> [email protected]
> http://www.haskell.org/mailman/listinfo/haskell-cafe

--
Gregory Collins <[email protected]>
