[Haskell-cafe] cutting long strings into lines

2006-09-30 Thread Andrea Rossato
Hello!

I've been trying for quite some time to find an elegant solution to
cut long strings into lines, but the only solution I was able to come
up is the following piece of ugly code.

Is there a library function for that? What kind of approach would you
suggest?

Thanks for your kind attention.

Andrea

Here's the code:

-- does the actual job
wrapString str =  foldr addNL  $ rmFirstSpace $ concat $ splitS (getIndx $ 
indx str) str

-- gets the indexes of the spaces within a string 
indx = findIndices (\x - if x == ' ' then True else False)

-- gets the indexes of where to split the string into lines: lines
-- must be between 60 and 75 char long
getIndx :: [Int] - [Int]
getIndx = takeFirst . checkBound . (delete 0) . nub . map (\x - if  x  60   
x `rem` 60 = 0  x `rem` 70 = 10  then x else 0)

-- groups indexes when their distance is too short
checkBound = groupBy (\x y - if y - x  10 then True else False)

-- takes the first index of a group of indexes
takeFirst = map (\(x:xs) - x)

-- split a string given a list of indexes
splitS _ [] = []
splitS (x:xs) (ls) = [take x ls] : splitS (map (\i - i - x) xs) (drop x ls)
splitS _ ls = [ls]:[]

-- remove the first space from the begging of a string in a list of strings 
rmFirstSpace = map (\(x:xs) - if x == ' ' then xs else x:xs) 

-- used by foldr to fold the list of substrings 
addNL s s1 = s ++ \n ++ s1


try with putStrLn $ wrapString longString
where: 
longString = The Haskell XML Toolbox (HXT) is a collection of tools for 
processing XML with Haskell. The core component of the Haskell XML Toolbox is a 
domain specific language, consisting of a set of combinators, for processing 
XML trees in a simple and elegant way. The combinator library is based on the 
concept of arrows. The main component is a validating and namespace aware 
XML-Parser that supports almost fully the XML 1.0 Standard. Extensions are a 
validator for RelaxNG and an XPath evaluator.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] cutting long strings into lines

2006-09-30 Thread Mark T.B. Carroll
I've been doing it as the enclosed. I wrote it a while ago, though, and
haven't really looked too hard at it since.

-- Mark
module WordWrap (wrap) where
import Data.Maybe

options :: String - [(String, String)]

options [] = [(, )]

options (x:xs) =
let rest = map (\(ys, zs) - (x:ys, zs)) (options xs)
 in if x == ' ' then (, xs) : rest else rest

bestSplit :: Int - String - (String, String)

bestSplit width string =
last (head wraps : takeWhile ((= width) . length . fst) (options string))

wrap :: Int - String - [String]

wrap _  = []

wrap width string =
let (x, ys) = bestSplit width string
 in x : wrap width ys
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] cutting long strings into lines

2006-09-30 Thread Matthias Fischmann

On Sat, Sep 30, 2006 at 11:54:19AM -0400, Mark T.B. Carroll wrote:
 module WordWrap (wrap) where
 import Data.Maybe
 
 options :: String - [(String, String)]
 
 options [] = [(, )]
 
 options (x:xs) =
 let rest = map (\(ys, zs) - (x:ys, zs)) (options xs)
  in if x == ' ' then (, xs) : rest else rest
 
 bestSplit :: Int - String - (String, String)
 
 bestSplit width string =
 last (head wraps : takeWhile ((= width) . length . fst) (options string))

works better if you just skip the head wraps part.  (and now i am
curious: what was it supposed to mean?  how did it get there?)

 wrap :: Int - String - [String]
 
 wrap _  = []
 
 wrap width string =
 let (x, ys) = bestSplit width string
  in x : wrap width ys


signature.asc
Description: Digital signature
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] cutting long strings into lines

2006-09-30 Thread Bulat Ziganshin
Hello Andrea,

Saturday, September 30, 2006, 7:02:34 PM, you wrote:

 -- gets the indexes of the spaces within a string
 indx = findIndices (\x - if x == ' ' then True else False)

indx = findIndices (==' ')

 -- takes the first index of a group of indexes
 takeFirst = map (\(x:xs) - x)

takeFirst = map head

 -- split a string given a list of indexes
 splitS _ [] = []
 splitS (x:xs) (ls) = [take x ls] : splitS (map (\i - i - x) xs) (drop x ls)
 splitS _ ls = [ls]:[]

 -- remove the first space from the begging of a string in a list of strings
 rmFirstSpace = map (\(x:xs) - if x == ' ' then xs else x:xs)

i would prefer to use map rmFirstSpace where
rmFirstSpace (' ':xs) = xs
rmFirstSpace xs = xs

 -- used by foldr to fold the list of substrings 
 addNL s s1 = s ++ \n ++ s1

foldrl addNl == unlines ?


 try with putStrLn $ wrapString longString
 where: 
 longString = The Haskell XML Toolbox (HXT) is a collection of
 tools for processing XML with Haskell. The core component of the
 Haskell XML Toolbox is a domain specific language, consisting of a
 set of combinators, for processing XML trees in a simple and elegant
 way. The combinator library is based on the concept of arrows. The
 main component is a validating and namespace aware XML-Parser that
 supports almost fully the XML 1.0 Standard. Extensions are a
 validator for RelaxNG and an XPath evaluator.

i think that your algorithm is too complex. standard algorithm, imho,
is to find last space before 80 (or 75) chars margin, split here and
then repeat this procedure again. so, one line split may look like

splitAt . last . filter (80) . findIndices (==' ')

and then you need to define function which repeats this operation on
the rest of list. or, slightky different solution:

-- |this function splits the list xs into parts whose length defined
-- by call to function len_f on the rest of list
splitByLen len_f [] = []
splitByLen len_f xs = y : splitByLen len_f ys
   where (y,ys) = splitAt (len_f xs) xs

-- |this function finds last space in String within 80-char boundary
len_f = last . filter (80) . findIndices (==' ')

so, splitByLen len_f should give you that you need, you need only to
add checks for some additional conditions (first word in line is more
than 80 bytes long, it is a last line) and removing of the extra space
on each line

btw, are you seen http://haskell.org/haskellwiki/Simple_unix_tools ?



-- 
Best regards,
 Bulatmailto:[EMAIL PROTECTED]

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] cutting long strings into lines

2006-09-30 Thread Matthias Fischmann

On Sat, Sep 30, 2006 at 04:36:02PM +0100, Neil Mitchell wrote:
 (if you can't be bothered to do that, the answer is lines ;)

although this wasn't the original problem, i like it, too :).  but now
i am stuck in finding an optimal implementation for lines.  the
following implementation is slightly slower than the built-in
function, and i suspect this to stem from the occurrance of reverse
for each line:

cut1 :: String - [String]
cut1 = f 
where
f x  = [reverse x]
f x ('\n':xs)  = reverse x : f  xs
f x (c:xs) = f (c:x) xs

i vaguely remember having seen a CPS trick here before, but all i can
come up with is the yet a little slower

cut2 :: String - [String]
cut2 = f id
where
f k  = [k ]
f k ('\n':xs)  = k  : f id xs
f k (c:xs) = f k' xs  where k' cs = k (c:cs)

also, i think both implementations are line-strict, that is, each line
is fully evaluated once touched in the first character.

is there a similar implementation, with CPS or not, that is lazy in
the lines and more efficient?


thanks,
matthias


signature.asc
Description: Digital signature
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] cutting long strings into lines

2006-09-30 Thread Andrea Rossato
On Sat, Sep 30, 2006 at 08:56:24PM +0400, Bulat Ziganshin wrote:
 i think that your algorithm is too complex. standard algorithm, imho,
 is to find last space before 80 (or 75) chars margin, split here and
 then repeat this procedure again. so, one line split may look like
 
 splitAt . last . filter (80) . findIndices (==' ')
...

Thank you very much for your analysis. I find it extremely helpful. 


 btw, are you seen http://haskell.org/haskellwiki/Simple_unix_tools ?

I did not! But I'm studying this page right now. Thanks for mentioning it.

Once again, thank you!
Andrea
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] cutting long strings into lines

2006-09-30 Thread Andrea Rossato
On Sat, Sep 30, 2006 at 08:56:24PM +0400, Bulat Ziganshin wrote:
 splitByLen len_f [] = []
 splitByLen len_f xs = y : splitByLen len_f ys
where (y,ys) = splitAt (len_f xs) xs
...
 so, splitByLen len_f should give you that you need, you need only to
 add checks for some additional conditions (first word in line is more
 than 80 bytes long, it is a last line) and removing of the extra space
 on each line

I came up with this solution that seem to be fine, to me. I does the
checking of those additional conditions:

findSplitP at = last . filter (at) . findIndices (==' ')
where last [] = at
  last [x] = x
  last (_:xs) = last xs

wrapLS at [] = []
wrapLS at s = take ln s ++ \n ++ rest 
where ln = findSplitP at s
  remain = drop ln s
  rest = if length remain  at 
 then wrapLS at (tail remain) 
 else tail remain

then you can use lines/unlines to split it.

Thanks for your help.
Best regards,
Andrea
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] cutting long strings into lines

2006-09-30 Thread Udo Stenzel
Matthias Fischmann wrote:
 although this wasn't the original problem, i like it, too :).  but now
 i am stuck in finding an optimal implementation for lines.

Isn't the obvious one good enough?

lines [] = []
lines s = go s
  where
go [] = [[]]
go ('\n':s) = [] : lines s
go (c:s) = let (l:ls) = go s in (c:l):ls


Udo.
-- 
Money can't buy friends, but it can get you a better class of enemy.
-- Spike Milligan


signature.asc
Description: Digital signature
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] cutting long strings into lines

2006-09-30 Thread Matthias Fischmann

On Sat, Sep 30, 2006 at 08:51:40PM +0200, Udo Stenzel wrote:
 To: Matthias Fischmann [EMAIL PROTECTED]
 Cc: haskell-cafe@haskell.org
 From: Udo Stenzel [EMAIL PROTECTED]
 Date: Sat, 30 Sep 2006 20:51:40 +0200
 Subject: Re: [Haskell-cafe] cutting long strings into lines
 
 Matthias Fischmann wrote:
  although this wasn't the original problem, i like it, too :).  but now
  i am stuck in finding an optimal implementation for lines.
 
 Isn't the obvious one good enough?
 
 lines [] = []
 lines s = go s
   where
 go [] = [[]]
 go ('\n':s) = [] : lines s
 go (c:s) = let (l:ls) = go s in (c:l):ls

thanks.  good enough, yes.  just not obvious to me...  (-:
matthias


signature.asc
Description: Digital signature
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe