Re: [Haskell-cafe] Re: Write Haskell as fast as C.
Ketil Malde [EMAIL PROTECTED] writes: data EvidenceCode = IAC | IUG | IFR | NAC | NR | ... deriving Show Could it be that this derived read instance is somehow very inefficient? To answer my own question: this is exactly it, ghc derives less than optimal code in this case. Rather than reiterate the case here, I did a quick writeup at http://blog.malde.org/, and would be quite happy about any feedback or suggestions. -k -- If I haven't seen further, it is by standing in the footprints of giants ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Write Haskell as fast as C.
Ketil Malde wrote: Ketil Malde [EMAIL PROTECTED] writes: data EvidenceCode = IAC | IUG | IFR | NAC | NR | ... deriving Show Could it be that this derived read instance is somehow very inefficient? To answer my own question: this is exactly it, ghc derives less than optimal code in this case. Rather than reiterate the case here, I did a quick writeup at http://blog.malde.org/, and would be quite happy about any feedback or suggestions. I think you'll find the code that GHC derives for a Read instance handles extra whitespace and all kinds of other whatnot that you don't actually need in this specific case. I suspect this is what's up - but I don't have any hard proof for that. ;-) ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Write Haskell as fast as C.
Andrew Coppin wrote: Ketil Malde wrote: Ketil Malde [EMAIL PROTECTED] writes: data EvidenceCode = IAC | IUG | IFR | NAC | NR | ... deriving Show Could it be that this derived read instance is somehow very inefficient? To answer my own question: this is exactly it, ghc derives less than optimal code in this case. Rather than reiterate the case here, I did a quick writeup at http://blog.malde.org/, and would be quite happy about any feedback or suggestions. I think you'll find the code that GHC derives for a Read instance handles extra whitespace and all kinds of other whatnot that you don't actually need in this specific case. I suspect this is what's up - but I don't have any hard proof for that. ;-) I wrote three programs: One does data Tag = Orange | Lemon | Lime | Apple | Banana | Pear | Peach deriving Read The other two use get_tag :: String - Tag get_tag Orange = Orange get_tag Lemon = Lemon get_tag Lime = Lime get_tag Apple = Apple get_tag Banana = Banana get_tag Pear = Pear get_tag Peach = Peach get_tag _= error not a tag and get_tag :: String - Tag get_tag xs = case xs of [] - bad (x:xs1) - case x of 'A' - case xs1 of pple - Apple _ - bad 'B' - case xs1 of anana - Banana _ - bad 'L' - case xs1 of emon - Lemon ime - Lime _ - bad 'O' - case xs1 of range - Orange _ - bad 'P' - case xs1 of ('e':'r':xs2) - case xs2 of r - Pear ch - Peach _- bad _ - bad _ - bad bad = error not a tag I wrote a driver program that reads a file of 1,000,000 tag values. Using the first version (GHC-derived Read instance) it took about 32 seconds to process. Using the second version (just a bunch of strings to match, no cleaverness at all) took approximately 1 second. The 3rd version was so fast I didn't have time to see the window open before it closed again. Note that all of this was using plain ordinary Strings, not ByteString or anything fancy like that. Note also that the actual documentation for the Prelude even says that Read is very slow. [Although it says it's slow for reading large structures, not large numbers of trivial structures.] None of this is really all that surprising; in the general case, a Read instance might have to skip over spaces or parse deeply nested brackets or any number of other things. If you know you don't need to handle all those cases, write your own parser. It shouldn't be hard to come up with something faster. [Altough obviously it's kinda tedious.] ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Write Haskell as fast as C.
Andrew Coppin wrote: [...] I think you'll find the code that GHC derives for a Read instance handles extra whitespace and all kinds of other whatnot that you don't actually need in this specific case. I suspect this is what's up - but I don't have any hard proof for that. ;-) Parentheses are handled as well. It's worse than that - derived read instances are defined in terms of 'lex' (via lexP in GHC.Read) which, among other things, recognizes numerical and string constants. The latter has the odd effect that with the following declaration, data A = A deriving (Read, Show) the expression read ('' : repeat ' ') :: A is the wrong kind of bottom - instead of a parse error, you get an infinite loop. Bertram ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Write Haskell as fast as C.
Don Stewart [EMAIL PROTECTED] writes: mkAnn :: ByteString - Annotation mkAnn = pick . B.words where pick (_db:up:rest) = pick' up $ getGo rest pick' up' (go:_:ev:_) = Ann (B.copy up') (read $ B.unpack go) (read $ B.unpack ev) getGo = dropWhile (not . B.isPrefixOf (pack GO:)) read $ B.unpack go Looks suspicious. You're unpacking to lists. ByteString performance rule 1: don't unpack to lists. I tend to use this idiom a bit when I want to loop over the characters. The strings being unpacked is an Int and a short (two or three letter) identifier. Doing a 'go' loop would probably be faster, but a lot more work, and I was hoping the String would be deforested or fused or otherwise optimized to the bone. I wonder if the culprit is the last 'read', it reads one from a set of keywords/identifiers, and since they're upper case, I just made a data type with a matching set of nullary constructors, and derived Read for it. I.e: data EvidenceCode = IAC | IUG | IFR | NAC | NR | ... deriving Show Could it be that this derived read instance is somehow very inefficient? -k -- If I haven't seen further, it is by standing in the footprints of giants ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Write Haskell as fast as C.
Andrew Coppin [EMAIL PROTECTED] writes: Look closer: it's hardER to read. mean xs = sum xs / fromIntegral (length xs) mean = go 0 0 n where go s l x | x m = s / fromIntegra l | otherwise = go (s+x) (l+1) (x+1 One version makes it instantly clear, at a glance, what is happening. The other requires you to mentally walk round a look, imperative style, to figure out what's happening. It's not a *big* deal, but it's unfortunate. I am new to Haskell and when I first saw the two versions side by side I too was disappointed with the second version. But after reading the great blog post by Don, I realized that the whole problem comes from the fact that lists in Haskell are not like arrays or vectors in other languages: you don't know how long they are before you have found the end. In other words: they behave like a linked lists -- lists that are generated lazily on demand. Because they are generated on demand you *cannot* really know the length beforehand, and thus you *must* traverse the list to its end to count the length. When the list is too big to fit in memory then it's clear that the code *must* let go of the beginning to allow the garbage collector to do its job. You wouldn't be able to work with a 7.5 GiB linked list otherwise. -- Martin Geisler VIFF (Virtual Ideal Functionality Framework) brings easy and efficient SMPC (Secure Multi-Party Computation) to Python. See: http://viff.dk/. pgpgElb902o1M.pgp Description: PGP signature ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Write Haskell as fast as C. [Was: Re: GHC predictability]
Don Stewart wrote: I've written an extended post on how to understand and reliably optimise code like this, looking at it all the way down to the assembly. The result are some simple rules to follow for generated code as good as gcc -O2. Enjoy, http://cgi.cse.unsw.edu.au/~dons/blog/2008/05/16#fast A well-written piece, as always. My feelings are ambivilent. On the one hand, it's reassuring that such good performance can be obtained without resorting to calling C, explicit unboxed types, GHC-specific hacks, strictness annotations, manual seq calls, strange case expressions, or really anything remotely odd. It's fairly plain Haskell '98 that most beginners would be able to read through and eventually understand. And yet it's fast. On the other hand, this is the anti-theisis of Haskell. We start with a high-level, declarative program, which performs horribly, and end up with a manually hand-optimised blob that's much harder to read but goes way faster. Obviously most people would prefer to write declarative code and feel secure that the compiler is going to produce something efficient. If the muse takes me, maybe I'll see if I can't find a less ugly way to do this... ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Write Haskell as fast as C. [Was: Re: GHC predictability]
andrewcoppin: Don Stewart wrote: I've written an extended post on how to understand and reliably optimise code like this, looking at it all the way down to the assembly. The result are some simple rules to follow for generated code as good as gcc -O2. Enjoy, http://cgi.cse.unsw.edu.au/~dons/blog/2008/05/16#fast A well-written piece, as always. My feelings are ambivilent. On the one hand, it's reassuring that such good performance can be obtained without resorting to calling C, explicit unboxed types, GHC-specific hacks, strictness annotations, manual seq calls, strange case expressions, or really anything remotely odd. It's fairly plain Haskell '98 that most beginners would be able to read through and eventually understand. And yet it's fast. On the other hand, this is the anti-theisis of Haskell. We start with a high-level, declarative program, which performs horribly, and end up with a manually hand-optimised blob that's much harder to read but goes way faster. Obviously most people would prefer to write declarative code and feel secure that the compiler is going to produce something efficient. If the muse takes me, maybe I'll see if I can't find a less ugly way to do this... I don't understand what's ugly about: go s l x | x m = s / fromIntegral l | otherwise = go (s+x) (l+1) (x+1) And the point is that it is *reliable*. If you make your money day in, day out writing Haskell, and you don't want to rely on radical transformations for correctness, this is a sensible idiom to follow. Nothing beats understanding what you're writing at all levels of abstraction. -- Don ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Write Haskell as fast as C. [Was: Re: GHC predictability]
Hello Andrew, Friday, May 16, 2008, 10:56:36 PM, you wrote: On the other hand, this is the anti-theisis of Haskell. We start with a high-level, declarative program, which performs horribly, and end up with a manually hand-optimised blob that's much harder to read but goes way faster. Obviously most people would prefer to write declarative code and feel secure that the compiler is going to produce something efficient. if i understood correctly, fusion system about which Don plan to told next time, is just about translating high-level code into lower-level one behind the scenes. but it works only on limited subset of programs. it's what we have now - haskell is very inefficient language -- Best regards, Bulatmailto:[EMAIL PROTECTED] ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Write Haskell as fast as C. [Was: Re: GHC predictability]
Andrew Coppin wrote: On the other hand, this is the anti-theisis of Haskell. We start with a high-level, declarative program, which performs horribly, and end up with a manually hand-optimised blob that's much harder to read but goes way faster. Buh? This is hard to read? mean n m = go 0 0 n where go s l x | x m = (s::Double) / fromIntegral (l::Int) | otherwise = go (s+x) (l+1) (x+1) One can in fact imagine a world in which the compiler does this transformation for you, though it takes a bit of squinting. http://reddit.com/r/programming/info/6jjhg/comments/c040ybt b ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Write Haskell as fast as C. [Was: Re: GHC predictability]
Bryan O'Sullivan wrote: Andrew Coppin wrote: On the other hand, this is the anti-theisis of Haskell. We start with a high-level, declarative program, which performs horribly, and end up with a manually hand-optimised blob that's much harder to read but goes way faster. Buh? This is hard to read? Look closer: it's hardER to read. mean xs = sum xs / fromIntegral (length xs) mean = go 0 0 n where go s l x | x m = s / fromIntegra l | otherwise = go (s+x) (l+1) (x+1 One version makes it instantly clear, at a glance, what is happening. The other requires you to mentally walk round a look, imperative style, to figure out what's happening. It's not a *big* deal, but it's unfortunate. I'm more worried about what happens in less trivial examples. [Let's face it, who wants to compute the sum of the numbers from 1 to N?] ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Write Haskell as fast as C. [Was: Re: GHC predictability]
Andrew Coppin wrote: Bryan O'Sullivan wrote: Andrew Coppin wrote: On the other hand, this is the anti-theisis of Haskell. We start with a high-level, declarative program, which performs horribly, and end up with a manually hand-optimised blob that's much harder to read but goes way faster. Buh? This is hard to read? Look closer: it's hardER to read. mean xs = sum xs / fromIntegral (length xs) mean = go 0 0 n where go s l x | x m = s / fromIntegral l | otherwise = go (s+x) (l+1) (x+1 One version makes it instantly clear, at a glance, what is happening. The other requires you to mentally walk round a look, imperative style, to figure out what's happening. It's not a *big* deal, but it's unfortunate. I'm more worried about what happens in less trivial examples. [Let's face it, who wants to compute the sum of the numbers from 1 to N?] Hm, it seems like you're expecting magic, aren't you? Of course the first equation is easier to read, but it's no surprise that this may actually be slower. Like the imperative bubblesort is easier to read than the imperative quicksort but far slower. Put differently, making Haskell as fast as C is easy: just write a slower C program! Namely one that is as easy to read as the Haskell version. If you implement mean xs = sum xs / fromIntegral (length xs) directly in C, I bet you'll be delighted to discover that they perform similarly (using linked lists in the C version). Regards, apfelmus ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Write Haskell as fast as C. [Was: Re: GHC predictability]
Don Stewart wrote: I don't understand what's ugly about: go s l x | x m = s / fromIntegral l | otherwise = go (s+x) (l+1) (x+1) And the point is that it is *reliable*. If you make your money day in, day out writing Haskell, and you don't want to rely on radical transformations for correctness, this is a sensible idiom to follow. Nothing beats understanding what you're writing at all levels of abstraction. What sets Haskell apart from every other programming language ever used in mainstream programming? You might say conciseness, or the ability to use lazy evaluation to structure your code in usual ways, or something like that. I would say what sets Haskell apart is abstraction. There are other things, but this is the big one. Haskell allows you to abstract almost everything. The result is often highly succinct yet very readable programs. It would seem a terribly shame if you always have to throw away Haskell's key advantage to get decent runtime performance. If you're trying to get a real program to work, right now, then yes, you may have no choice. But that doesn't mean we shouldn't strive for ways to keep code high-level yet performant. [I'm curios about your other comment. Does anybody, anywhere in the world, actually make *money* using Haskell? This seems rather unlikely to me...] ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Write Haskell as fast as C. [Was: Re: GHC predictability]
apfelmus wrote: Andrew Coppin wrote: Look closer: it's hardER to read. mean xs = sum xs / fromIntegral (length xs) mean = go 0 0 n where go s l x | x m = s / fromIntegral l | otherwise = go (s+x) (l+1) (x+1 One version makes it instantly clear, at a glance, what is happening. The other requires you to mentally walk round a look, imperative style, to figure out what's happening. It's not a *big* deal, but it's unfortunate. I'm more worried about what happens in less trivial examples. [Let's face it, who wants to compute the sum of the numbers from 1 to N?] Hm, it seems like you're expecting magic, aren't you? Well, obviously it would be nice, wouldn't it? ;-) Of course the first equation is easier to read, but it's no surprise that this may actually be slower. Like the imperative bubblesort is easier to read than the imperative quicksort but far slower. I'm just saying, I prefer it when somebody posts some tiny snippet of Haskell that does the same thing as a 40-line C program, and then show how using some novel technique they just invented, the Haskell version actually outperforms C even though it's more reasable and more maintainable. Hey, who *wouldn't* like to have their cake and eat it too? :-) But yeah, I get the point. Everybody wants me to be quiet and go away. So I'll go be quiet now... ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Write Haskell as fast as C. [Was: Re: GHC predictability]
Andrew Coppin [EMAIL PROTECTED] wrote: But yeah, I get the point. Everybody wants me to be quiet and go away. So I'll go be quiet now... Yes and no. Everybody wants you to be quiet and go to your study, writing a compiler that's Smart Enough(tm). We will let you out as soon as you're finished and supply you with pizza and crackers from time to time. -- (c) this sig last receiving data processing entity. Inspect headers for past copyright information. All rights reserved. Unauthorised copying, hiring, renting, public performance and/or broadcasting of this signature prohibited. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Write Haskell as fast as C. [Was: Re: GHC predictability]
Achim Schneider wrote: Andrew Coppin [EMAIL PROTECTED] wrote: But yeah, I get the point. Everybody wants me to be quiet and go away. So I'll go be quiet now... Yes and no. Everybody wants you to be quiet and go to your study, writing a compiler that's Smart Enough(tm). We will let you out as soon as you're finished and supply you with pizza and crackers from time to time. I... I think you just described my ideal place of employment! 0_0 It sure as hell beats the living daylights out of the nonesense I just spent 9-5 today dealing with. :-S ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Write Haskell as fast as C. [Was: Re: GHC predictability]
On Fri, 16 May 2008, Don Stewart wrote: I don't understand what's ugly about: go s l x | x m = s / fromIntegral l | otherwise = go (s+x) (l+1) (x+1) I suspect you've been looking at low-level code too long. How about the total lack of domain concepts? Try: mean n m = let (sum, length, _) = go (0,0,n) in sum / fromIntegral length where go :: (Double, Int, Double) - (Double, Int, Double) go t@(s,l,x) | x m = t | otherwise = go (s+x) (l+1) (x+1) as a starting point. I might consider generalising to a while HOF while I'm at it, because ultimately this is a while loop. Admittedly that would be relying on the implementation doing a little inlining, which you're not reliant on. Given the audience it'd be good to see some of the working to pull it off via fusion in a comment too: -- [1 .. d ] = unfoldr (let f n | n d = Nothing -- f n = Just (n,n+1) in f) 1 -- sum = foldr ... -- length = foldr ... -- sumAndLength = foldr ... (as calculated from the above) -- mean [1 .. d] = s / l where -- (sum, length) = sumAndLength [1 .. d] -- = sumAndLength . unfoldr ... -- = foldr ... . unfoldr ... -- = ... Some things it might be nice to have and/or know about: * We really ought to be able to build the sumAndLength fold by building the appropriate tuple and tuple function from its components - with there being a standard idiom for naming them, and a library of such things to hand. Same thing for go, too - this means we retain the domain concepts in its implementation by default. The interesting thing about go is that we ourselves running the guts of an unfold at the same time, which means it supplies our termination criteria - I suspect I ought to post a 'most general' form of go on that basis soon? * Does nesting unboxed tuples work appropriately? I was about to suggest a standard way of doing the wiring for the tupling as well, but so long as nesting unboxed tuples works and the optimiser 'gets it' then there's an arrow instance that ought to do nicely! While not quite as low-level, the resulting code should hopefully be easy bait for GHC's optimiser (if not, someone please yell!), while both retaining much of the domain info of the original code and conveying much about how it's made to go fast. Nothing beats understanding what you're writing at all levels of abstraction. How about ensuring that a casual reader can do the same quickly? -- [EMAIL PROTECTED] Performance anxiety leads to premature optimisation ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Write Haskell as fast as C. [Was: Re: GHC predictability]
On Fri, 16 May 2008, Andrew Coppin wrote: Obviously most people would prefer to write declarative code and feel secure that the compiler is going to produce something efficient. Ultimately the only way to do this is to stick to Einstein's advice - make things as simple as possible but no simpler. This means that if you care about speed then somewhere, the structure that enables a fast implementation needs to be declared so that the compiler can work with it. For example, you might not want to hand-fuse (I know I get bored of it pretty quickly) but the possibility of fusion will have to be clear. If you don't want to have to do it yourself (or don't know how!) and you want to be confident that something's going to run fast, that means a library covering a range of known cases that'll all go quickly. Don has been a major contributor here! But it's hard work, and if you don't understand how fast code is structured then ultimately you won't be able to predict - there'll never be a guarantee that lets you be completely ignorant. -- [EMAIL PROTECTED] 'In Ankh-Morpork even the shit have a street to itself... Truly this is a land of opportunity.' - Detritus, Men at Arms ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Write Haskell as fast as C.
Andrew Coppin [EMAIL PROTECTED] writes: I'm more worried about what happens in less trivial examples. [Let's face it, who wants to compute the sum of the numbers from 1 to N?] Inspired by Don's blog post, and coincidentally working on a program where profiling points to one particular, short function as responsible for 60% of the work, I thought this would be a good time to look into core and reveal the deep secrets of my code. This is the function: mkAnn :: ByteString - Annotation mkAnn = pick . B.words where pick (_db:up:rest) = pick' up $ getGo rest pick' up' (go:_:ev:_) = Ann (B.copy up') (read $ B.unpack go) (read $ B.unpack ev) getGo = dropWhile (not . B.isPrefixOf (pack GO:)) A bit clunky, but simple enough: given a line of input, break into words, pick word number two, the word starting with GO: and the second-to-next word. Here are the data types involved: data Annotation = Ann !UniProtAcc !GoTerm !EvidenceCode deriving (Show) newtype GoTerm = GO Int deriving (Eq,Ord) type UniProtAcc = ByteString data EvidenceCode = ... -- many nullary constructors Unfortunately, this results in no less than four pages of core, with tons of less intelligible identfiers and nested cases and whatnot... any idea why this would be so slow? -k -- If I haven't seen further, it is by standing in the footprints of giants ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Write Haskell as fast as C.
ketil: Andrew Coppin [EMAIL PROTECTED] writes: I'm more worried about what happens in less trivial examples. [Let's face it, who wants to compute the sum of the numbers from 1 to N?] Inspired by Don's blog post, and coincidentally working on a program where profiling points to one particular, short function as responsible for 60% of the work, I thought this would be a good time to look into core and reveal the deep secrets of my code. This is the function: mkAnn :: ByteString - Annotation mkAnn = pick . B.words where pick (_db:up:rest) = pick' up $ getGo rest pick' up' (go:_:ev:_) = Ann (B.copy up') (read $ B.unpack go) (read $ B.unpack ev) getGo = dropWhile (not . B.isPrefixOf (pack GO:)) read $ B.unpack go Looks suspicious. You're unpacking to lists. ByteString performance rule 1: don't unpack to lists. -- Don ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Write Haskell as fast as C.
Ketil Malde wrote: mkAnn :: ByteString - Annotation mkAnn = pick . B.words where pick (_db:up:rest) = pick' up $ getGo rest pick' up' (go:_:ev:_) = Ann (B.copy up') (read $ B.unpack go) (read $ B.unpack ev) getGo = dropWhile (not . B.isPrefixOf (pack GO:)) It seems at first face miraculously coincidental that the dropWhile in the getGo definition knows to stop dropping when there are exactly 4 elements, in order to match the pattern in the second parameter of the pick' definition, whose argument is provided by (getGo Rest). What magic makes this true? Just curious... ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Write Haskell as fast as C.
Dan Weston wrote: Ketil Malde wrote: mkAnn :: ByteString - Annotation mkAnn = pick . B.words where pick (_db:up:rest) = pick' up $ getGo rest pick' up' (go:_:ev:_) = Ann (B.copy up') (read $ B.unpack go) (read $ B.unpack ev) getGo = dropWhile (not . B.isPrefixOf (pack GO:)) It seems at first face miraculously coincidental that the dropWhile in the getGo definition knows to stop dropping when there are exactly 4 elements, in order to match the pattern in the second parameter of the pick' definition, whose argument is provided by (getGo Rest). What magic makes this true? Just curious... I didn't mean exactly 4, but at least 3. Otherwise, I'm still curious! :) ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Write Haskell as fast as C.
Dan Weston [EMAIL PROTECTED] writes: mkAnn :: ByteString - Annotation mkAnn = pick . B.words where pick (_db:up:rest) = pick' up $ getGo rest pick' up' (go:_:ev:_) = Ann (B.copy up') (read $ B.unpack go) (read $ B.unpack ev) getGo = dropWhile (not . B.isPrefixOf (pack GO:)) It seems at first face miraculously coincidental that the dropWhile in the getGo definition knows to stop dropping when there are exactly 4 elements, in order to match the pattern in the second parameter of the pick' definition, whose argument is provided by (getGo Rest). What magic makes this true? Just curious... You want the long story? :-) This is for parsing the GOA file format, which contains links between proteins from the UniProt database to Gene Onthology (GO) terms. The format is not quite as regular as one would wish, but the second word is always the protein id, and whenever the GO term turns up, it is followed by something I forget (an InterPro reference perhaps) and then the evidence code - which I want. You feel happier now, I can tell. -k -- If I haven't seen further, it is by standing in the footprints of giants ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe