[Haskell-cafe] Re: Software Tools in Haskell
Benja Fallenstein wrote: Henning Thielemann wrote: I remember there was a discussion about how to implement full 'wc' in an elegant but maximally lazy form, that is counting bytes, words and lines in one go. Did someone have a nice idea of how to compose the three counters from implementations of each counter? I'm afraid one cannot simply use the split and count fragments trick then. Well, you could rely on catamorphism fusion (foldr f1 x1, foldr f2 x2) = foldr (f1 *** f2) (x1,x2) but that's not so compositional. Could you turn the folds into scans and use zip3 and last? I.e., something like this: This approach is really clever! data Triple a b c = Triple !a !b !c deriving Show countChars :: String - [Int] countChars = scanl (\n _ - n+1) 0 countChar :: Char - String - [Int] countChar c = scanl (\n c' - if c == c' then n+1 else n) 0 countLines = countChar '\n' countWords = countChar ' ' last' [x] = x last' (x:xs) = x `seq` last' xs zip3' (x:xs) (y:ys) (z:zs) = Triple x y z : zip3' xs ys zs zip3' _ _ _ = [] zipWith3 Triple wc :: String - Triple Int Int Int wc xs = last' $ zip3' (countChars xs) (countWords xs) (countLines xs) main = print . wc = getContents (or use Data.Strict.Tuple -- but that only has pairs and no zip...) Slightly simplified (uses BangPatterns): import Data.List scanl' :: (b - a - b) - b - [a] - [a] scanl' f !b [] = [b] scanl' f !b (x:xs) = b:scanl' (f b x) xs counts :: (a - Bool) - [a] - [Int] counts p = scanl' (\n c - if p c then n+1 else n) 0 wc :: String - (Int,Int,Int) wc = last $ zip3 (charc xs) (wordc xs) (linec xs) where charc = counts (const True) wordc = counts (== ' ') linec = counts (== '\n') The scanl' basically ensures that the forcing the resulting list spine automatically forces the elements. This makes sense to do early and we can use normal list functions after that. Regards, apfelmus ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Software Tools in Haskell
Tommy M McGuire wrote: apfelmus wrote: tabwidth = 4 -- tabstop !! (col-1) == there is a tabstop at column col -- This is an infinite list, so no need to limit the line width tabstops = map (\col - col `mod` tabwidth == 1) [1..] -- calculate spaces needed to fill to the next tabstop in advance tabspaces = snd $ mapAccumR addspace [] tabstops addspace cs isstop = let cs'=' ':cs in (if isstop then [] else cs',cs') Are you using mapAccumR (mapAccumR? (!)) to share space among the space strings? Sharing is a good idea! But mapAccumR has nothing to do with it, I just used it to encode the recursion, as replacement for a fold so to speak. If so, wouldn't this be better: tabstops = map (\col - col `mod` tabwidth == 1) [1..tabwidth] tabspaces = cycle $ snd $ mapAccumR addspace [] tabstops Yes. We can make the code even simpler :) tabspaces = cycle . init . tails . replicate tabwidth $ ' ' and the tabstops list is gone. On the other hand, wouldn't this make for less head scratching: tabspaces = map (\col - replicate (spacesFor col) ' ') [1..] where spacesFor col = tabwidth - ((col - 1) `mod` tabwidth) Yes and no. The very idea of introducing the tabspaces list in the first place is to avoid explicit indices altogether, a single zipWith is responsible for aligning columns. So, it's only natural to avoid indices for the definition of tabspaces , too. A side effect of separating tabspaces from the main loop is that we can do all kind of irregular tabstop spacing or different fill characters and the like solely by changing this list. main = interact $ unlines . map detabLine . lines where detabLine = concat $ zipWith replace tabspaces I think you mean concat . zipWith (You're doing this from memory, aren't you?) Yes and yes :) replace cs '\t' = cs -- replace with adequate number of spaces replace _ char = [char] -- pass through How about that? It doesn't produce the same output, [...] It's counting tabs before expanding rather than after? Yes, I noticed it too late, it's so wrong (_) :) Here's a correct version: perLine f = interact $ unlines . map f . lines main = perLine (detabLine tabspaces) where detabLine _ []= [] detabLine (w:ws) ('\t':cs) = detabLine (w:ws) (w ++ cs) detabLine (w:ws) (c :cs) = c:detabLine ws cs Or even main = interact $ detab tabspaces where detab _ []= [] detab _ ('\n':cs) = '\n':detab tabspaces cs detab (w:ws) ('\t':cs) = detab (w:ws) (w ++ cs) detab (_:ws) (c :cs) =c:detab ws cs This can't be expressed with zip anymore since the alignment of the list of spaces and the text changes when encountering a tab. @dons: I guess that detab would probably be a very interesting (and even useful) study example for generalizing stream fusion, since it's more like concatMap than map . Regards, apfelmus ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Software Tools in Haskell
Tommy M McGuire wrote: (Plus, interact is scary. :-D ) You have a scary feeling for a moment, then it passes. ;) Gwern Branwen wrote: I... I want to provide a one-liner for 'detab', but it looks impressively monstrous and I'm not sure I understand it. On the other hand, I'm not looking for one-liners; I really want clarity as opposed to cleverness. tabwidth = 4 -- tabstop !! (col-1) == there is a tabstop at column col -- This is an infinite list, so no need to limit the line width tabstops = map (\col - col `mod` tabwidth == 1) [1..] -- calculate spaces needed to fill to the next tabstop in advance tabspaces = snd $ mapAccumR addspace [] tabstops addspace cs isstop = let cs'=' ':cs in (if isstop then [] else cs',cs') main = interact $ unlines . map detabLine . lines where detabLine = concat $ zipWith replace tabspaces replace cs '\t' = cs -- replace with adequate number of spaces replace _ char = [char] -- pass through How about that? Regards, apfelmus ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Software Tools in Haskell
apfelmus wrote: Tommy M McGuire wrote: (Plus, interact is scary. :-D ) You have a scary feeling for a moment, then it passes. ;) tabwidth = 4 -- tabstop !! (col-1) == there is a tabstop at column col -- This is an infinite list, so no need to limit the line width tabstops = map (\col - col `mod` tabwidth == 1) [1..] -- calculate spaces needed to fill to the next tabstop in advance tabspaces = snd $ mapAccumR addspace [] tabstops addspace cs isstop = let cs'=' ':cs in (if isstop then [] else cs',cs') Are you using mapAccumR (mapAccumR? (!)) to share space among the space strings? If so, wouldn't this be better: tabstops = map (\col - col `mod` tabwidth == 1) [1..tabwidth] tabspaces = cycle $ snd $ mapAccumR addspace [] tabstops On the other hand, wouldn't this make for less head scratching: tabspaces = map (\col - replicate (spacesFor col) ' ') [1..] where spacesFor col = tabwidth - ((col - 1) `mod` tabwidth) main = interact $ unlines . map detabLine . lines where detabLine = concat $ zipWith replace tabspaces I think you mean concat . zipWith (You're doing this from memory, aren't you?) replace cs '\t' = cs -- replace with adequate number of spaces replace _ char = [char] -- pass through How about that? It doesn't produce the same output, although I almost like it enough not to care: $ od -a test 000 ht c o l sp 1 ht 2 ht 3 4 ht r e s t 020 nl 021 $ runhaskell detab.hs test col 1 2 34 rest $ runhaskell detab2.hs test col 1 234 rest It's counting tabs before expanding rather than after? -- Tommy M. McGuire [EMAIL PROTECTED] ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Software Tools in Haskell
Since there are many useful per-line functions, do a little refactoring, placing the following into a library: perLine :: (String - String) - (String - String) perLine f = unlines . map f . lines On Dec 12, 2007 12:43 PM, apfelmus [EMAIL PROTECTED] wrote: Tommy M McGuire wrote: (Plus, interact is scary. :-D ) You have a scary feeling for a moment, then it passes. ;) Gwern Branwen wrote: I... I want to provide a one-liner for 'detab', but it looks impressively monstrous and I'm not sure I understand it. On the other hand, I'm not looking for one-liners; I really want clarity as opposed to cleverness. tabwidth = 4 -- tabstop !! (col-1) == there is a tabstop at column col -- This is an infinite list, so no need to limit the line width tabstops = map (\col - col `mod` tabwidth == 1) [1..] -- calculate spaces needed to fill to the next tabstop in advance tabspaces = snd $ mapAccumR addspace [] tabstops addspace cs isstop = let cs'=' ':cs in (if isstop then [] else cs',cs') main = interact $ unlines . map detabLine . lines where detabLine = concat $ zipWith replace tabspaces replace cs '\t' = cs -- replace with adequate number of spaces replace _ char = [char] -- pass through How about that? Regards, apfelmus ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe