Re: [Haskell-cafe] File I/O benchmark help (conduit, io-streams and Handle)
On Fri, Mar 8, 2013 at 6:36 PM, Simon Marlow marlo...@gmail.com wrote:

> 1GB/s for copying a file is reasonable - it's around half the memory
> bandwidth, so copying the data twice would give that result (assuming
> no actual I/O is taking place, which is what you want because actual
> I/O will swamp any differences at the software level).
>
> The Handle overhead should be negligible if you're only using
> hGetBufSome and hPutBuf, because those functions basically just call
> read() and write() when the amount of data is larger than the buffer
> size.
>
> There's clearly something suspicious going on here. Unfortunately I
> don't have time right now to investigate, but I'll keep an eye on the
> thread.

Possibly disk caching/syncing issues? If some of the tests are able to either read entirely from cache (on the 1MB test), or don't completely sync after the write, they could happen much faster than others that have to actually hit the disk. For the 60MB test, it's almost guaranteed that actual IO would take place and dominate the timings.

John L.

> Cheers,
> Simon
>
> On 08/03/13 08:36, Gregory Collins wrote:
>
>> +Simon Marlow
>>
>> A couple of comments:
>>
>> * maybe we shouldn't back the file by a Handle. io-streams does this
>>   by default out of the box; I had a posix file interface for unix
>>   (guarded by CPP) for a while but decided to ditch it for
>>   simplicity. If your results are correct, given how slow going by
>>   Handle seems to be I may revisit this, I figured it would be good
>>   enough.
>> * io-streams turns Handle buffering off in withFileAsOutput. So the
>>   difference shouldn't be as a result of buffering. Simon: is this an
>>   expected result? I presume you did some Handle debugging?
>> * the IO manager should not have any bearing here because file code
>>   doesn't actually ever use it (epoll() doesn't work for files)
>> * does the difference persist when the file size gets bigger?
>> * your file descriptor code doesn't handle EINTR properly, although
>>   you said you checked that the file copy is being done?
>> * Copying a 1MB file in 1ms gives a throughput of ~1GB/s. The other
>>   methods have a more believable ~70MB/s throughput.
>>
>> G
>>
>> On Fri, Mar 8, 2013 at 7:30 AM, Michael Snoyman mich...@snoyman.com wrote:
>>
>>> Hi all,
>>>
>>> I'm turning to the community for some help understanding some
>>> benchmark results[1]. I was curious to see how the new io-streams
>>> would work with conduit, as it looks like a far saner low-level
>>> approach than Handles. In fact, the API is so simple that the
>>> entire wrapper is just a few lines of code[2].
>>>
>>> I then added in some basic file copy benchmarks, comparing
>>> conduit+Handle (with ResourceT or bracket), conduit+io-streams,
>>> straight io-streams, and lazy I/O. All approaches fell into the
>>> same ballpark, with conduit+bracket and conduit+io-streams taking a
>>> slight lead. (I haven't analyzed that enough to know if it means
>>> anything, however.)
>>>
>>> Then I decided to pull up the NoHandle code I wrote a while ago for
>>> conduit. This code was written initially for Windows only, to work
>>> around the fact that System.IO.openFile does some file locking. To
>>> avoid using Handles, I wrote a simple FFI wrapper exposing open,
>>> read, and close system calls, ported it to POSIX, and hid it behind
>>> a Cabal flag. Out of curiosity, I decided to expose it and include
>>> it in the benchmark.
>>>
>>> The results are extreme. I've confirmed multiple times that the
>>> copy algorithm is in fact copying the file, so I don't think the
>>> test itself is cheating somehow. But I don't know how to explain
>>> the massive gap. I've run this on two different systems. The
>>> results you see linked are from my local machine. On an EC2
>>> instance, the gap was a bit smaller, but the NoHandle code was
>>> still 75% faster than the others.
>>>
>>> My initial guess is that I'm not properly tying into the IO
>>> manager, but I wanted to see if the community had any thoughts. The
>>> relevant pieces of code are [3][4][5].
>>>
>>> Michael
>>>
>>> [1] http://static.snoyman.com/streams.html
>>> [2] https://github.com/snoyberg/conduit/blob/streams/io-streams-conduit/Data/Conduit/Streams.hs
>>> [3] https://github.com/snoyberg/conduit/blob/streams/conduit/System/PosixFile.hsc
>>> [4] https://github.com/snoyberg/conduit/blob/streams/conduit/Data/Conduit/Binary.hs#L54
>>> [5] https://github.com/snoyberg/conduit/blob/streams/conduit/Data/Conduit/Binary.hs#L167
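For readers who want to see what a copy loop built only on hGetBufSome and hPutBuf looks like, here is a minimal sketch. It is not the benchmark code from the thread; the name copyViaHandle, the 64 KiB buffer size and the demo file names are assumptions chosen for illustration.

```haskell
import Control.Monad (unless)
import Foreign.Marshal.Alloc (allocaBytes)
import System.IO

-- Copy src to dst through a single fixed-size buffer, using only
-- hGetBufSome and hPutBuf on binary-mode Handles.
copyViaHandle :: FilePath -> FilePath -> IO ()
copyViaHandle src dst =
  withBinaryFile src ReadMode $ \hIn ->
  withBinaryFile dst WriteMode $ \hOut ->
  allocaBytes bufSize $ \buf -> do
    let loop = do
          -- Returns 0 only at end of file.
          n <- hGetBufSome hIn buf bufSize
          unless (n == 0) $ do
            hPutBuf hOut buf n
            loop
    loop
  where
    bufSize = 64 * 1024  -- hypothetical choice, not taken from the thread

main :: IO ()
main = do
  writeFile "copy-demo.in" (replicate 200000 'x')
  copyViaHandle "copy-demo.in" "copy-demo.out"
  copied <- readFile "copy-demo.out"
  print (length copied)
```

Timing a loop like this against a raw read()/write() version is the comparison the thread is about; the sketch only shows the Handle side.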
Re: [Haskell-cafe] To seq or not to seq, that is the question
On Fri, Mar 08, 2013 at 08:53:15PM -0800, Edward Z. Yang wrote:

> Are these equivalent? If not, under what circumstances are they not
> equivalent? When should you use each?
>
>     evaluate a >> return b
>
> [...]
>
> - Use 'evaluate' when you mean to say, "Evaluate this thunk to HNF
>   before doing any other IO actions, please." Use it as much as
>   possible in IO.

I've never looked at evaluate before but I've just found its haddock and given it some thought.

http://hackage.haskell.org/packages/archive/base/latest/doc/html/Control-Exception-Base.html#v:evaluate

Since it is asserted that

    evaluate x = (return $! x) >>= return

is it right to say (on an informal level at least) that evaluating an IO action to WHNF means evaluating it to the outermost >>= or return?

> For non-IO monads, since everything is imprecise anyway, it doesn't
> matter.

Could you explain what you mean by imprecise?

Tom

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] ACCAT 2013: final call for participation and invitation to discussion
[Apologies if you receive more than one copy of the following announcement]

FINAL CALL FOR PARTICIPATION AND INVITATION TO DISCUSSION

==
8th International Workshop on Applied and Computational Category Theory
ACCAT 2013
http://accat2013.zib.de/
Satellite Event of ETAPS 2013, Rome, March 17 2013
==

Deadline for online registration: March 8

*Attendees are warmly invited to join the closing discussion. Please communicate your intention to deliver a position statement to the organizers.*

==

Since the 1960s, the use of category theory in computer science has been a fruitful one, ranging from automata theory to algebraic specification to programming languages. In recent years techniques and methods from CT have been adopted as a standard research tool, and considered as such in different venues around the world.

The ACCAT workshop on Applied and Computational Category Theory has been one of these venues. Since its inception in 2006, ACCAT has provided a forum where invited contributors presented their own research on different facets of category theory applied to computer science.

Despite ACCAT's success, we believe that the formula should be revised. Indeed, we have the feeling that a conference is missing where all kinds of applications of category theory to computer science can be presented (like the former CTCS conference, which somehow ended in 2006).

This year, we would like to use the ACCAT forum to raise this issue and to discuss it with a larger audience. Thus, we invited 8 top researchers in the area of applications of category theory. The list of speakers is:

Samson Abramsky
Robin B. Cockett
Barbara Koenig
Ugo Montanari
Till Mossakowski
Dusko Pavlovic
Andrzej Tarlecki
Glynn Winskel

We do hope that the meeting will be fruitful, providing a good exchange of ideas and planting the seed for future events. Indeed, one of the outcomes of the meeting is to decide whether to push for a high-level workshop/conference on the application of category theory to computer science, or at least to verify the viability of a Dagstuhl meeting on the issue. Therefore, after the presentations, the workshop will end with a general discussion among the attendees.

--
Dr. Ulrike Golas
Konrad-Zuse-Zentrum für Informationstechnik Berlin
Takustr. 7, 14195 Berlin, Germany
Tel. +49 30 84185 - 318
go...@zib.de -- www.zib.de/golas
[Haskell-cafe] ANNOUNCE: hF2-0.2
Hi,

this is the second release of hF2, an F(2^e) backend for cryptographic code, to be found at http://hackage.haskell.org/package/hF2 (or simply by cabal install hF2).

This library is used in hecc for elliptic curve cryptography on binary field curves and came into existence during my master's thesis. Since the code from back then, some speedups and changes to the data representation were made, which led to an increase in speed from the first correct prototype to this release by the factor 10^86 on my main development machine. Sadly, this is still slower than pure C or Assembler versions, but a lot more portable, (arguably) easier to read and easier to parallelize.

The code does automatic bit slicing and uses mainly the vector library as a fast backend.

Feats of this release:
- Speed (256 bit curve point multiplication in hecc is now at about a second in time)
- LINEAR speedup in threaded execution with the number of cores (up to the number of bits divided by the wordsize)
- Manually tested
- Prelude-like Interface (more in progress)
- Mostly timing attack resistant (also in progress)

Next up: Testing and fixing hecc for this release.

Have a nice weekend,
Marcel

--
Marcel Fourné
OpenPGP-Key-ID: 4991 8AA4 202F 12AC 41F7 6C77 CA83 BDF0 7454 5C72
I am a programmer. I want languages and libraries, not just huge tools.
Re: [Haskell-cafe] Shake, Shelly, FilePath
Shelly is using system-filepath which was created as an improvement over using a simple String. For a build system in which you name all your files you may not care about the upside. If you want a version of Shelly that uses String you can try Shellish, its predecessor. Otherwise the shim should be written for Shake to use system-filepath.

Greg Weber
Re: [Haskell-cafe] File I/O benchmark help (conduit, io-streams and Handle)
Just to clarify: the problem was in fact with my code: I was not passing O_TRUNC to the open system call. Gregory's C code showed me the problem. Once I add in that option, all the different benchmarks complete in roughly the same amount of time. So given that our Haskell implementations based on Handle are just about as fast as a raw C implementation, I'd say Handle is performing very well.

Apologies if I got anyone overly concerned.
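The effect of truncating (or not truncating) an existing output file can be seen without any FFI code. The sketch below is an illustration using only System.IO, not the PosixFile code from the thread: it relies on the Haskell report's rule that WriteMode truncates an existing file while ReadWriteMode leaves its contents in place. The helper name overwriteWith and the file name are made up for the example, and how GHC maps these modes onto open() flags is not something the sketch depends on.

```haskell
import System.IO

-- Write `contents` at the start of an existing file through a Handle
-- opened in the given mode, then report the resulting size in bytes.
overwriteWith :: IOMode -> FilePath -> String -> IO Integer
overwriteWith mode path contents = do
  withBinaryFile path mode $ \h -> hPutStr h contents
  withBinaryFile path ReadMode hFileSize

main :: IO ()
main = do
  let path = "trunc-demo.bin"  -- hypothetical scratch file
  writeFile path (replicate 10 'a')
  -- ReadWriteMode keeps the old contents; only the first bytes change.
  kept <- overwriteWith ReadWriteMode path "bbb"
  writeFile path (replicate 10 'a')
  -- WriteMode truncates the file before anything is written.
  truncated <- overwriteWith WriteMode path "bbb"
  print (kept, truncated)
```

A benchmark that overwrites an existing destination without truncating it is therefore not doing the same work as one that starts from an empty file, which is one reason to double-check open flags when two copy routines differ wildly in speed.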
[Haskell-cafe] Overloading
Hi,

I just started playing around a bit with Haskell, so sorry in advance for very basic (and maybe stupid) questions. Coming from the C++ world, one thing I would like to do is overloading operators. For example, I want to write (Date 6 6 1973) + (Period 2 Months) for some self-defined types Date and Period. Another example would be (Period 1 Years) + (Period 3 Months).

Just defining the operator (+) does not work because it collides with Prelude.+. I assume using fully qualified names would work, but that is not what I want.

So maybe make the types instances of typeclasses? This would be Num for (+) I guess. For the first example above it will not work, however, if only because it is not of type a -> a -> a. Also the second example does not fit, because I would have to make Period an instance of Num, which does not make sense, because I cannot multiply Periods (for example).

Am I missing something, or is what I am trying here just impossible by the language design (and then probably for a good reason)?

A second question concerns the constructors in own datatypes like Date above. Is it possible to restrict the construction of objects to sensible inputs, i.e. reject something like Date 50 23 2013? My workaround would be to provide a function, say

    date :: Int -> Int -> Int -> Date

checking the input and returning a Date object, or throwing an error if the input does not correspond to a real date. I could then hide the Date constructor itself (by not exporting it). However, this seems not really elegant. Also again, taking this way I cannot provide several constructors taking inputs of different types, can I?

Thanks a lot
Peter
Re: [Haskell-cafe] Overloading
On Mar 10, 2013, at 12:33 AM, Peter Caspers pcaspers1...@gmail.com wrote:

> Hi,
>
> I just started playing around a bit with Haskell, so sorry in advance
> for very basic (and maybe stupid) questions. Coming from the C++
> world one thing I would like to do is overloading operators. For
> example I want to write (Date 6 6 1973) + (Period 2 Months) for some
> self defined types Date and Period. Another example would be
> (Period 1 Years) + (Period 3 Months).
>
> Just defining the operator (+) does not work because it collides with
> Prelude.+. I assume using fully qualified names would work, but that
> is not what I want.
>
> So maybe make the types instances of typeclasses? This would be Num
> for (+) I guess. For the first example above it will not work
> however, alone for it is not of type a -> a -> a. Also the second
> example does not fit, because I would have to make Period an instance
> of Num, which does not make sense, because I can not multiply Periods
> (for example).

If you really want that, you can stop ghc from importing Prelude. I haven't tested it yet, but I think

    import Prelude hiding (Num)

should work. Of course, in this case you would lose all predefined instances of Num, including the ability to add integers, but you can get them back through another module.

But I would strongly suggest that you define another operator instead. Unlike C++, Haskell allows you to define as many operators as you like.

> Am I missing something or is that what I am trying here just
> impossible by the language design (and then probably for a good
> reason) ?
>
> A second question concerns the constructors in own datatypes like
> Date above. Is it possible to restrict the construction of objects to
> sensible inputs, i.e. reject something like Date 50 23 2013 ? My
> workaround would be to provide a function say
>
>     date :: Int -> Int -> Int -> Date
>
> checking the input and returning a Date object or throw an error if
> the input does not correspond to a real date. I could then hide the
> Date constructor itself (by not exporting it). However this seems not
> really elegant.

Well, it's the way it is usually done. This is called a smart constructor pattern.

> Also again, taking this way I can not provide several constructors
> taking inputs of different types, can I ?

Sorry, didn't get what you mean here.
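To make the "define another operator" suggestion concrete, here is a small sketch. The operator name (.+), the Unit type and the month arithmetic are all assumptions made for illustration; in particular the day of month is not clamped, so this is not real calendar arithmetic. (If the name (+) itself is wanted, `import Prelude hiding ((+))` is the form that hides the operator.)

```haskell
-- Units from the question; Days etc. are left out to keep the sketch short.
data Unit = Months | Years
  deriving (Eq, Show)

data Period = Period Int Unit
  deriving (Eq, Show)

-- Day, month, year, in the order used in the question (Date 6 6 1973).
data Date = Date Int Int Int
  deriving (Eq, Show)

infixl 6 .+

-- Shift a date by a period. Simplification: the day of month is kept
-- as it is, so 31 January plus one month yields "31 February".
(.+) :: Date -> Period -> Date
Date d m y .+ Period n Years  = Date d m (y + n)
Date d m y .+ Period n Months = Date d m' y'
  where
    total    = y * 12 + (m - 1) + n   -- months counted from year 0
    (y', m0) = total `divMod` 12
    m'       = m0 + 1

main :: IO ()
main = do
  print (Date 6 6 1973 .+ Period 2 Months)
  print (Date 1 12 2012 .+ Period 1 Months)
```

Because (.+) has its own type Date -> Period -> Date, nothing has to be squeezed into Num, and Prelude's (+) keeps working for ordinary numbers.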
Re: [Haskell-cafe] Overloading
On Mar 9, 2013, at 3:33 PM, Peter Caspers pcaspers1...@gmail.com wrote:

> Hi,
>
> I just started playing around a bit with Haskell, so sorry in advance
> for very basic (and maybe stupid) questions. Coming from the C++
> world one thing I would like to do is overloading operators. For
> example I want to write (Date 6 6 1973) + (Period 2 Months) for some
> self defined types Date and Period. Another example would be
> (Period 1 Years) + (Period 3 Months).
>
> So maybe make the types instances of typeclasses? This would be Num
> for (+) I guess. For the first example above it will not work
> however, alone for it is not of type a -> a -> a. Also the second
> example does not fit, because I would have to make Period an instance
> of Num, which does not make sense, because I can not multiply Periods
> (for example).
>
> Am I missing something or is that what I am trying here just
> impossible by the language design (and then probably for a good
> reason) ?

Take a look at affine spaces and additive groups in the vector-space package. There may be other treatments of torsors on hackage, but vector-space has a fairly straightforward approach.

> A second question concerns the constructors in own datatypes like
> Date above. Is it possible to restrict the construction of objects to
> sensible inputs, i.e. reject something like Date 50 23 2013 ? My
> workaround would be to provide a function say
>
>     date :: Int -> Int -> Int -> Date
>
> checking the input and returning a Date object or throw an error if
> the input does not correspond to a real date. I could then hide the
> Date constructor itself (by not exporting it). However this seems not
> really elegant. Also again, taking this way I can not provide several
> constructors taking inputs of different types, can I ?

This approach -- hiding data constructors and exporting functions that perform validation -- is called smart constructors, and is accepted practice. It isn't entirely satisfying due to interfering with pattern matching in client code, so you either need to work with projection functions for your data type, or use ViewPatterns to provide a more transparent record type at use sites.

Anthony

> Thanks a lot
> Peter
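A smart constructor with projection functions might look like the sketch below. The names mkDate, day, month and year are illustrative, and returning Maybe instead of throwing an error is a substitution made here, not something proposed in the thread. In a library the type would sit in its own module with an export list along the lines of `module Date (Date, mkDate, day, month, year) where`, so that the raw constructor stays private.

```haskell
-- Day, month, year. The constructor is meant to stay unexported.
data Date = Date Int Int Int
  deriving (Eq, Show)

-- Smart constructor: only calendar-valid combinations produce a Date.
mkDate :: Int -> Int -> Int -> Maybe Date
mkDate d m y
  | m < 1 || m > 12         = Nothing
  | d < 1 || d > daysIn m y = Nothing
  | otherwise               = Just (Date d m y)

-- Number of days in month m of year y (Gregorian rules).
daysIn :: Int -> Int -> Int
daysIn m y
  | m == 2                 = if isLeap y then 29 else 28
  | m `elem` [4, 6, 9, 11] = 30
  | otherwise              = 31

isLeap :: Int -> Bool
isLeap y = (y `mod` 4 == 0 && y `mod` 100 /= 0) || y `mod` 400 == 0

-- Projection functions stand in for pattern matching in client code.
day, month, year :: Date -> Int
day   (Date d _ _) = d
month (Date _ m _) = m
year  (Date _ _ y) = y

main :: IO ()
main = do
  print (mkDate 6 6 1973)
  print (mkDate 50 23 2013)
  print (fmap year (mkDate 29 2 2012))
```

Plain functions are used for the projections rather than record fields, since exported record fields would also allow record-update syntax, which bypasses the validation.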
Re: [Haskell-cafe] To seq or not to seq, that is the question
Excerpts from Tom Ellis's message of Sat Mar 09 00:34:41 -0800 2013:

> I've never looked at evaluate before but I've just found it's haddock
> and given it some thought.
>
> http://hackage.haskell.org/packages/archive/base/latest/doc/html/Control-Exception-Base.html#v:evaluate
>
> Since it is asserted that
>
>     evaluate x = (return $! x) >>= return
>
> is it right to say (on an informal level at least) that evaluating an
> IO action to WHNF means evaluating it to the outermost >>= or return?

Sure.

    Prelude> let x = undefined :: IO a
    Prelude> x `seq` ()
    *** Exception: Prelude.undefined
    Prelude> (x >>= undefined) `seq` ()
    ()

> > For non-IO monads, since everything is imprecise anyway, it doesn't
> > matter.
>
> Could you explain what you mean by imprecise?

Imprecise as in imprecise exceptions, http://research.microsoft.com/en-us/um/people/simonpj/papers/imprecise-exn.htm

Edward
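The practical side of this distinction can be sketched in a small program. The helper raises and the value boom are names invented for the example; the sketch observes whether an ErrorCall surfaces while an action runs, which is a slightly different question from the GHCi session above (that one forces the action itself without running it).

```haskell
import Control.Exception (ErrorCall, evaluate, try)

-- Run an action and report whether it raised an ErrorCall.
-- The result of the action itself is never inspected.
raises :: IO a -> IO Bool
raises act = isError <$> try act
  where
    isError :: Either ErrorCall b -> Bool
    isError = either (const True) (const False)

main :: IO ()
main = do
  let boom = error "boom" :: Int
  -- return wraps the thunk without forcing it.
  lazyReturn <- raises (return boom)
  -- return $! forces the thunk as part of producing the action.
  strictReturn <- raises (return $! boom)
  -- evaluate forces the thunk when the action is executed.
  viaEvaluate <- raises (evaluate boom)
  -- Forcing the action built by evaluate to WHNF, without running it,
  -- leaves the thunk alone.
  whnfOnly <- raises (evaluate boom `seq` return ())
  print (lazyReturn, strictReturn, viaEvaluate, whnfOnly)
```

The last case is the one the haddock equation is about: `evaluate x` is already a value as an IO action, and the forcing of x only happens once that action is run.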
Re: [Haskell-cafe] Overloading
> Also again, taking this way I can not provide several constructors
> taking inputs of different types, can I ?

You can have multiple constructors, taking different numbers and types of input parameters, yes.
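As a sketch of what "multiple constructors" means here: one type can offer several constructors, each with its own argument list. The type DateSpec and its constructor names are invented for the example.

```haskell
-- One type, three ways to build a value of it.
data DateSpec
  = DayMonthYear Int Int Int  -- e.g. DayMonthYear 6 6 1973
  | DayOfYear Int Int         -- day number within the year, then the year
  | Textual String            -- unparsed text
  deriving (Eq, Show)

-- Client code tells the alternatives apart by pattern matching.
describe :: DateSpec -> String
describe (DayMonthYear d m y) = show d ++ "." ++ show m ++ "." ++ show y
describe (DayOfYear n y)      = "day " ++ show n ++ " of " ++ show y
describe (Textual s)          = "text: " ++ s

main :: IO ()
main = mapM_ (putStrLn . describe)
  [ DayMonthYear 6 6 1973
  , DayOfYear 157 1973
  , Textual "6 June 1973"
  ]
```

Unlike overloaded C++ constructors, each alternative has its own name, so the compiler never has to pick one from the argument types.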
Re: [Haskell-cafe] Overloading
Thank you all for your answers, this helps a lot. To clarify my last point ...

> > Also again, taking this way I can not provide several constructors
> > taking inputs of different types, can I ?
>
> Sorry, didn't get what you mean here.

In C++ it is perfectly normal to have overloaded functions like

    f : Int -> Int -> Int
    f : Int -> Char -> Int

in coexistence, because the compiler can infer (at compile time) what function to call by looking at the argument types.

In Haskell I think this is not possible, simply due to the flexibility given by partial function application, i.e.

    f 5

would not be well defined any more; it could be Int -> Int or Char -> Int.

Thanks again and kind regards
Peter
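For what it's worth, a type class gives this kind of overloading, and partial application still works. The sketch below uses an invented class name Combine and arbitrary instance bodies; it is one way to get both signatures of f to coexist, not the only one.

```haskell
-- Overload f on the type of its second argument.
class Combine b where
  f :: Int -> b -> Int

instance Combine Int where
  f x y = x + y

instance Combine Char where
  f x c = x + fromEnum c

main :: IO ()
main = do
  print (f 1 (2 :: Int))
  print (f 1 'a')
  -- Partial application is fine: f 5 has type Combine b => b -> Int
  -- and becomes concrete once the argument type is known.
  let g = f 5 :: Char -> Int
  print (g 'a')
```

The instance is chosen at compile time from the argument type, much as in C++; the difference is that the overloaded name has to be declared in a class first.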