Re: [Haskell-cafe] The question of ByteString
Duncan Coutts wrote: On Fri, 2007-11-02 at 21:35 +, Andrew Coppin wrote: Well OK, maybe I was a little vague. Let me be a bit more specific... If you do text processing using ByteString rather than String, you get dramatically better performance in time and space. For me, this raises a number of questions: 1. Why do I have to type ByteString in my code? Why isn't the compiler automatically performing this optimisation for me? (I.e., is there some observable property that is changed? Yes, the semantics are different. ByteString is stricter. In some circumstances you could discover that some list is being used sufficiently strictly (spine and element strict) that you could do a representation change to use strict arrays. It is something I have pondered occasionally and I think that is an interesting avenue for research. OK. So I'm not the first person to wonder about this then... ;-) One approach might be to do a more sophisticated strictness analysis However this is likely to be quite fragile. I usually think that it's better to declare the strictness you want up front in one place Yeah, I can imagine. And I guess then you'd be forever wondering hey, did the compiler do that optimisation or not? like with certain current optimisations now. (E.g., did that type get unboxed?) Currently the answer is yes: the ByteString interface only provides trancated Unicode characters. But, in principle, that could be changed.) Indeed it could, we could provide a proper Unicode string type. Likely to happen in the near term? (Not that I imagine this is a huge issue for anybody...) 2. ByteString makes text strings faster. But what about other kinds of collections? Can't we do something similar to them that makes them go faster? There is much less benefit for other collections since the overheads of generic structures are smaller for other types. Note that the NDP parallel arrays stuff uses type functions to calculate optimised data representations for arrays of types. Type... functions...? That sounds pretty scary. :-} I was thinking more immediately about lists in general, but also perhaps binary trees and things. (Although it's probably hard to optimise a general binary tree; you would probably optimise a tree designed for a specific *purpose* instead...) As I understand it, ByteString is faster due to several factors. First of all, it's stricter. So that's the semantic difference. Secondly, it's an unboxed structure. Which is the representation optimisation allowed by the semantic change of making it stricter. Indeed. Third, it's implemented as an array that looks like a linked list. Given how ubiquitous lists are in Haskell, array that looks like a linked list sounds like one seriously useful data type! Yet ByteString seems to be the only implementation of this concept - and only for lists on unboxed bytes. (Not even unboxed Word16 or anything else, *only* Word8.) If I understand this correctly, a ByteString is actually a linked list of large array chunks. (This presumably yields fastER random access than a plain linked list?) Also, it seems to be possible to create a new array which is merely a subrange of an existing one, without any copying; the standard array API doesn't seem to provide this, yet it sounds damn useful. I think the NDP project should get us most of this stuff actually. Cool. So in summary, these kinds of things are on the way, they're just not here quite yet? (BTW, anybody have any clue what's happening with stream fusion? I remember reading the paper saying hey, this is how it works and it's cool and we're going to try to replace the whole list library with new stream implementations, but that's the last I heard...) ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Why does GHC limit stack size?
Hello all, Why is there a limitation on the stack size in GHC? Like heap where we can limit the size by -M RTS option but the default is unlimited, why not let the program use as big a stack as required? If not by default, then by a separate option? Some of the functions that we write in recursive fashion will usually cause a stack overflow, but will work fine if there is more stack (suppose we are not worried about efficiency). And these functions generally look nicer and compact than their tail recursive versions. Is this is a technical hurdle, or just a checkpoint for runaway programs? Can we have an RTS flag allowing unlimited stack size? Thanks, Abhay ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Why does GHC limit stack size?
Hello, Why is there a limitation on the stack size in GHC? Like heap where we can limit the size by -M RTS option but the default is unlimited, why not let the program use as big a stack as required? If not by default, then by a separate option? Some of the functions that we write in recursive fashion will usually cause a stack overflow, but will work fine if there is more stack (suppose we are not worried about efficiency). And these functions generally look nicer and compact than their tail recursive versions. Is this is a technical hurdle, or just a checkpoint for runaway programs? This was discussed a while ago on the ghc users mailing list. I think there was general agreement that this was bad, but that doing something better meant a lot of work for someone (who could be trusted to get it right :-) http://www.haskell.org/pipermail/glasgow-haskell-users/2007-May/012467.html Regards -- Adrian Hey ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Why does GHC limit stack size?
Bulat Ziganshin wrote: because program that require 8mb stack, will probably require 8gb when processing more data :) So.. what? You could say the same about heap, which was rather the point of the earlier thread. Regards -- Adrian Hey ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] The question of ByteString
Andrew Coppin wrote: Duncan Coutts wrote: Yes, the semantics are different. ByteString is stricter. In some circumstances you could discover that some list is being used sufficiently strictly (spine and element strict) that you could do a representation change to use strict arrays. It is something I have pondered occasionally and I think that is an interesting avenue for research. Also important is the fact that String = [Char], and a Char can hold any Unicode character, whereas a ByteString is a sequence of Word8 elements (i.e. integers from 0-255). If you want to store a Char in a ByteString then you have to convert it to UTF-8 or something similar. The Encoding package (http://hackage.haskell.org/cgi-bin/hackage-scripts/package/encoding-0.2) has the functions for this kind of thing. Paul. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Why can't Haskell be faster?
On Nov 3, 2007 5:00 AM, Ryan Dickie [EMAIL PROTECTED] wrote: Lossless File compression, AKA entropy coding, attempts to maximize the amount of information per bit (or byte) to be as close to the entropy as possible. Basically, gzip is measuring (approximating) the amount of information contained in the code. Hmmm, interesting idea. I think it would be interesting to compare the ratios between raw file size its entropy (we can come up with a precise metric later). This would show us how concise the language and code actually is. Yeah, let's all write in bytecode using a hex editor :-D ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Compile-time evaluation
Hi Nicholas, compileTimeEval :: Data a = a - ExpQ compileTimeEval = return . toExp You're telling me all that horrendous pain in implementing toExp and it already exists?!? Yes unfortunately, compileTimeEval already exists in TH, it's called lift compileTimeEval :: Lift a = a - ExpQ compileTimeEval = lift But don't be so hard on yourself. Your approach has one advantage. GHC supports automatic derivation of Data whereas Lift instances have to be created manually. Note, however, that Lift instances can also be generated using Igloo's th-lift package[1]. Cheers, Fons [1] http://hackage.haskell.org/cgi-bin/hackage-scripts/package/th-lift-0.2 ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] The question of ByteString
On Nov 3, 2007, at 5:34 , Andrew Coppin wrote: (BTW, anybody have any clue what's happening with stream fusion? I remember reading the paper saying hey, this is how it works and it's cool and we're going to try to replace the whole list library with new stream implementations, but that's the last I heard...) dons and I have both mentioned Data.Stream recently; I think it's on hackage. It's still a work in progress though, I believe. -- brandon s. allbery [solaris,freebsd,perl,pugs,haskell] [EMAIL PROTECTED] system administrator [openafs,heimdal,too many hats] [EMAIL PROTECTED] electrical and computer engineering, carnegie mellon universityKF8NH ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Rose Tree
Hello, I need help... I am having trouble with rose trees. 1 2 3 4 5 6 7 8910 11 That is the rose tree that I seek. Data Tree a = Empty | Leaf a | Node a [(Tree a)] example :: Tree (String, String) example = Node (a,b) -- root node [ ] -- end of tree What I want to do is create two functions that return either the children or parents of a given input, here a String. String - Tree - [String] I think _ The next generation of MSN Hotmail has arrived - Windows Live Hotmail http://www.newhotmail.co.uk___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] torrent for 6.8.1?
do torrents exist for 6.8.1? my experience is that people will use torrents if they are offered and they really do lift the pressure from the origin domain (haskell.org) pgpTQvjxj07Kq.pgp Description: PGP signature ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe