[Haskell-cafe] Re: Software Tools in Haskell

2007-12-15 Thread apfelmus

Benja Fallenstein wrote:

Henning Thielemann wrote:

I remember there was a discussion about how to implement full 'wc' in an
elegant but maximally lazy form, that is counting bytes, words and lines
in one go. Did someone have a nice idea of how to compose the three
counters from implementations of each counter? I'm afraid one cannot
simply use the split and count fragments trick then.


Well, you could rely on catamorphism fusion

  (foldr f1 x1, foldr f2 x2) = foldr (f1 *** f2) (x1,x2)

but that's not so compositional.


Could you turn the folds into scans and use zip3 and last? I.e.,
something like this:


This approach is really clever!


data Triple a b c = Triple !a !b !c deriving Show

countChars :: String - [Int]
countChars = scanl (\n _ - n+1) 0

countChar :: Char - String - [Int]
countChar c = scanl (\n c' - if c == c' then n+1 else n) 0

countLines = countChar '\n'
countWords = countChar ' '

last' [x] = x
last' (x:xs) = x `seq` last' xs

zip3' (x:xs) (y:ys) (z:zs) = Triple x y z : zip3' xs ys zs
zip3' _ _ _ = []


  zipWith3 Triple


wc :: String - Triple Int Int Int
wc xs = last' $ zip3' (countChars xs) (countWords xs) (countLines xs)

main = print . wc = getContents

(or use Data.Strict.Tuple -- but that only has pairs and no zip...)


Slightly simplified (uses BangPatterns):

  import Data.List

  scanl' :: (b - a - b) - b - [a] - [a]
  scanl' f !b [] = [b]
  scanl' f !b (x:xs) = b:scanl' (f b x) xs

  counts :: (a - Bool) - [a] - [Int]
  counts p = scanl' (\n c - if p c then n+1 else n) 0

  wc :: String - (Int,Int,Int)
  wc = last $ zip3 (charc xs) (wordc xs) (linec xs)
 where
 charc = counts (const True)
 wordc = counts (== ' ')
 linec = counts (== '\n')

The  scanl'  basically ensures that the forcing the resulting list spine 
automatically forces the elements. This makes sense to do early and we 
can use normal list functions after that.



Regards,
apfelmus

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Software Tools in Haskell

2007-12-13 Thread apfelmus

Tommy M McGuire wrote:

apfelmus wrote:


  tabwidth = 4

 -- tabstop !! (col-1) == there is a tabstop at column  col
 -- This is an infinite list, so no need to limit the line width
  tabstops  = map (\col - col `mod` tabwidth == 1) [1..]

 -- calculate spaces needed to fill to the next tabstop in advance
  tabspaces = snd $ mapAccumR addspace [] tabstops
  addspace cs isstop = let cs'=' ':cs in (if isstop then [] else cs',cs')


Are you using mapAccumR (mapAccumR? (!)) to share space among the space 
strings?


Sharing is a good idea! But  mapAccumR  has nothing to do with it, I 
just used it to encode the recursion, as replacement for a  fold  so to 
speak.



 If so, wouldn't this be better:

tabstops = map (\col - col `mod` tabwidth == 1) [1..tabwidth]
tabspaces = cycle $ snd $ mapAccumR addspace [] tabstops


Yes. We can make the code even simpler :)

  tabspaces = cycle . init . tails . replicate tabwidth $ ' '

and the  tabstops  list is gone.


On the other hand, wouldn't this make for less head scratching:

tabspaces = map (\col - replicate (spacesFor col) ' ') [1..]
  where
  spacesFor col = tabwidth - ((col - 1) `mod` tabwidth)


Yes and no. The very idea of introducing the  tabspaces  list in the 
first place is to avoid explicit indices altogether, a single  zipWith 
is responsible for aligning columns. So, it's only natural to avoid 
indices for the definition of  tabspaces , too.


A side effect of separating  tabspaces  from the main loop is that we 
can do all kind of irregular tabstop spacing or different fill 
characters and the like solely by changing this list.



  main = interact $ unlines . map detabLine . lines
 where
 detabLine = concat $ zipWith replace tabspaces


I think you mean concat . zipWith   (You're doing this from 
memory, aren't you?)


Yes and yes :)


 replace cs '\t' = cs -- replace with adequate number of spaces
 replace _  char = [char] -- pass through


How about that?


It doesn't produce the same output, [...]
It's counting tabs before expanding rather than after?


Yes, I noticed it too late, it's so wrong (_) :)

Here's a correct version:

  perLine f = interact $ unlines . map f . lines

  main = perLine (detabLine tabspaces)
 where
 detabLine _  []= []
 detabLine (w:ws) ('\t':cs) = detabLine (w:ws) (w ++ cs)
 detabLine (w:ws) (c   :cs) = c:detabLine ws cs

Or even

  main = interact $ detab tabspaces
 where
 detab _  []= []
 detab _  ('\n':cs) = '\n':detab tabspaces cs
 detab (w:ws) ('\t':cs) =  detab (w:ws) (w ++ cs)
 detab (_:ws) (c   :cs) =c:detab ws cs

This can't be expressed with  zip  anymore since the alignment of the 
list of spaces and the text changes when encountering a tab.



@dons: I guess that  detab  would probably be a very interesting (and 
even useful) study example for generalizing stream fusion, since it's 
more like  concatMap  than  map .



Regards,
apfelmus

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Software Tools in Haskell

2007-12-12 Thread apfelmus

Tommy M McGuire wrote:

(Plus, interact is scary. :-D )


You have a scary feeling for a moment, then it passes. ;)


Gwern Branwen wrote:

I... I want to provide a one-liner for 'detab', but it looks
impressively monstrous and I'm not sure I understand it.


On the other hand, I'm not looking for one-liners; I really want clarity 
as opposed to cleverness.


  tabwidth = 4

 -- tabstop !! (col-1) == there is a tabstop at column  col
 -- This is an infinite list, so no need to limit the line width
  tabstops  = map (\col - col `mod` tabwidth == 1) [1..]

 -- calculate spaces needed to fill to the next tabstop in advance
  tabspaces = snd $ mapAccumR addspace [] tabstops
  addspace cs isstop = let cs'=' ':cs in (if isstop then [] else cs',cs')


  main = interact $ unlines . map detabLine . lines
 where
 detabLine = concat $ zipWith replace tabspaces
 replace cs '\t' = cs -- replace with adequate number of spaces
 replace _  char = [char] -- pass through


How about that?


Regards,
apfelmus

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Software Tools in Haskell

2007-12-12 Thread Tommy M McGuire

apfelmus wrote:

Tommy M McGuire wrote:

(Plus, interact is scary. :-D )


You have a scary feeling for a moment, then it passes. ;)


  tabwidth = 4

 -- tabstop !! (col-1) == there is a tabstop at column  col
 -- This is an infinite list, so no need to limit the line width
  tabstops  = map (\col - col `mod` tabwidth == 1) [1..]

 -- calculate spaces needed to fill to the next tabstop in advance
  tabspaces = snd $ mapAccumR addspace [] tabstops
  addspace cs isstop = let cs'=' ':cs in (if isstop then [] else cs',cs')


Are you using mapAccumR (mapAccumR? (!)) to share space among the space 
strings?  If so, wouldn't this be better:


tabstops = map (\col - col `mod` tabwidth == 1) [1..tabwidth]
tabspaces = cycle $ snd $ mapAccumR addspace [] tabstops

On the other hand, wouldn't this make for less head scratching:

tabspaces = map (\col - replicate (spacesFor col) ' ') [1..]
  where
  spacesFor col = tabwidth - ((col - 1) `mod` tabwidth)


  main = interact $ unlines . map detabLine . lines
 where
 detabLine = concat $ zipWith replace tabspaces


I think you mean concat . zipWith   (You're doing this from 
memory, aren't you?)



 replace cs '\t' = cs -- replace with adequate number of spaces
 replace _  char = [char] -- pass through


How about that?


It doesn't produce the same output, although I almost like it enough not 
to care:


$ od -a test
000  ht   c   o   l  sp   1  ht   2  ht   3   4  ht   r   e   s   t
020  nl
021
$ runhaskell detab.hs test
col 1   2   34  rest
$ runhaskell detab2.hs test
col 1  234 rest

It's counting tabs before expanding rather than after?

--
Tommy M. McGuire
[EMAIL PROTECTED]
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Software Tools in Haskell

2007-12-12 Thread Conal Elliott
Since there are many useful per-line functions, do a little refactoring,
placing the following into a library:

  perLine :: (String - String) - (String - String)
  perLine f = unlines . map f . lines


On Dec 12, 2007 12:43 PM, apfelmus [EMAIL PROTECTED] wrote:

 Tommy M McGuire wrote:
  (Plus, interact is scary. :-D )

 You have a scary feeling for a moment, then it passes. ;)

  Gwern Branwen wrote:
  I... I want to provide a one-liner for 'detab', but it looks
  impressively monstrous and I'm not sure I understand it.
 
  On the other hand, I'm not looking for one-liners; I really want clarity
  as opposed to cleverness.

   tabwidth = 4

  -- tabstop !! (col-1) == there is a tabstop at column  col
  -- This is an infinite list, so no need to limit the line width
   tabstops  = map (\col - col `mod` tabwidth == 1) [1..]

  -- calculate spaces needed to fill to the next tabstop in advance
   tabspaces = snd $ mapAccumR addspace [] tabstops
   addspace cs isstop = let cs'=' ':cs in (if isstop then [] else cs',cs')


   main = interact $ unlines . map detabLine . lines
  where
  detabLine = concat $ zipWith replace tabspaces
  replace cs '\t' = cs -- replace with adequate number of spaces
  replace _  char = [char] -- pass through


 How about that?


 Regards,
 apfelmus

 ___
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe