[Haskell-cafe] cutting long strings into lines
Hello! I've been trying for quite some time to find an elegant solution to cut long strings into lines, but the only solution I was able to come up is the following piece of ugly code. Is there a library function for that? What kind of approach would you suggest? Thanks for your kind attention. Andrea Here's the code: -- does the actual job wrapString str = foldr addNL $ rmFirstSpace $ concat $ splitS (getIndx $ indx str) str -- gets the indexes of the spaces within a string indx = findIndices (\x - if x == ' ' then True else False) -- gets the indexes of where to split the string into lines: lines -- must be between 60 and 75 char long getIndx :: [Int] - [Int] getIndx = takeFirst . checkBound . (delete 0) . nub . map (\x - if x 60 x `rem` 60 = 0 x `rem` 70 = 10 then x else 0) -- groups indexes when their distance is too short checkBound = groupBy (\x y - if y - x 10 then True else False) -- takes the first index of a group of indexes takeFirst = map (\(x:xs) - x) -- split a string given a list of indexes splitS _ [] = [] splitS (x:xs) (ls) = [take x ls] : splitS (map (\i - i - x) xs) (drop x ls) splitS _ ls = [ls]:[] -- remove the first space from the begging of a string in a list of strings rmFirstSpace = map (\(x:xs) - if x == ' ' then xs else x:xs) -- used by foldr to fold the list of substrings addNL s s1 = s ++ \n ++ s1 try with putStrLn $ wrapString longString where: longString = The Haskell XML Toolbox (HXT) is a collection of tools for processing XML with Haskell. The core component of the Haskell XML Toolbox is a domain specific language, consisting of a set of combinators, for processing XML trees in a simple and elegant way. The combinator library is based on the concept of arrows. The main component is a validating and namespace aware XML-Parser that supports almost fully the XML 1.0 Standard. Extensions are a validator for RelaxNG and an XPath evaluator. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] cutting long strings into lines
I've been doing it as the enclosed. I wrote it a while ago, though, and haven't really looked too hard at it since. -- Mark module WordWrap (wrap) where import Data.Maybe options :: String - [(String, String)] options [] = [(, )] options (x:xs) = let rest = map (\(ys, zs) - (x:ys, zs)) (options xs) in if x == ' ' then (, xs) : rest else rest bestSplit :: Int - String - (String, String) bestSplit width string = last (head wraps : takeWhile ((= width) . length . fst) (options string)) wrap :: Int - String - [String] wrap _ = [] wrap width string = let (x, ys) = bestSplit width string in x : wrap width ys ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] cutting long strings into lines
On Sat, Sep 30, 2006 at 11:54:19AM -0400, Mark T.B. Carroll wrote: module WordWrap (wrap) where import Data.Maybe options :: String - [(String, String)] options [] = [(, )] options (x:xs) = let rest = map (\(ys, zs) - (x:ys, zs)) (options xs) in if x == ' ' then (, xs) : rest else rest bestSplit :: Int - String - (String, String) bestSplit width string = last (head wraps : takeWhile ((= width) . length . fst) (options string)) works better if you just skip the head wraps part. (and now i am curious: what was it supposed to mean? how did it get there?) wrap :: Int - String - [String] wrap _ = [] wrap width string = let (x, ys) = bestSplit width string in x : wrap width ys signature.asc Description: Digital signature ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] cutting long strings into lines
Hello Andrea, Saturday, September 30, 2006, 7:02:34 PM, you wrote: -- gets the indexes of the spaces within a string indx = findIndices (\x - if x == ' ' then True else False) indx = findIndices (==' ') -- takes the first index of a group of indexes takeFirst = map (\(x:xs) - x) takeFirst = map head -- split a string given a list of indexes splitS _ [] = [] splitS (x:xs) (ls) = [take x ls] : splitS (map (\i - i - x) xs) (drop x ls) splitS _ ls = [ls]:[] -- remove the first space from the begging of a string in a list of strings rmFirstSpace = map (\(x:xs) - if x == ' ' then xs else x:xs) i would prefer to use map rmFirstSpace where rmFirstSpace (' ':xs) = xs rmFirstSpace xs = xs -- used by foldr to fold the list of substrings addNL s s1 = s ++ \n ++ s1 foldrl addNl == unlines ? try with putStrLn $ wrapString longString where: longString = The Haskell XML Toolbox (HXT) is a collection of tools for processing XML with Haskell. The core component of the Haskell XML Toolbox is a domain specific language, consisting of a set of combinators, for processing XML trees in a simple and elegant way. The combinator library is based on the concept of arrows. The main component is a validating and namespace aware XML-Parser that supports almost fully the XML 1.0 Standard. Extensions are a validator for RelaxNG and an XPath evaluator. i think that your algorithm is too complex. standard algorithm, imho, is to find last space before 80 (or 75) chars margin, split here and then repeat this procedure again. so, one line split may look like splitAt . last . filter (80) . findIndices (==' ') and then you need to define function which repeats this operation on the rest of list. or, slightky different solution: -- |this function splits the list xs into parts whose length defined -- by call to function len_f on the rest of list splitByLen len_f [] = [] splitByLen len_f xs = y : splitByLen len_f ys where (y,ys) = splitAt (len_f xs) xs -- |this function finds last space in String within 80-char boundary len_f = last . filter (80) . findIndices (==' ') so, splitByLen len_f should give you that you need, you need only to add checks for some additional conditions (first word in line is more than 80 bytes long, it is a last line) and removing of the extra space on each line btw, are you seen http://haskell.org/haskellwiki/Simple_unix_tools ? -- Best regards, Bulatmailto:[EMAIL PROTECTED] ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] cutting long strings into lines
On Sat, Sep 30, 2006 at 04:36:02PM +0100, Neil Mitchell wrote: (if you can't be bothered to do that, the answer is lines ;) although this wasn't the original problem, i like it, too :). but now i am stuck in finding an optimal implementation for lines. the following implementation is slightly slower than the built-in function, and i suspect this to stem from the occurrance of reverse for each line: cut1 :: String - [String] cut1 = f where f x = [reverse x] f x ('\n':xs) = reverse x : f xs f x (c:xs) = f (c:x) xs i vaguely remember having seen a CPS trick here before, but all i can come up with is the yet a little slower cut2 :: String - [String] cut2 = f id where f k = [k ] f k ('\n':xs) = k : f id xs f k (c:xs) = f k' xs where k' cs = k (c:cs) also, i think both implementations are line-strict, that is, each line is fully evaluated once touched in the first character. is there a similar implementation, with CPS or not, that is lazy in the lines and more efficient? thanks, matthias signature.asc Description: Digital signature ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] cutting long strings into lines
On Sat, Sep 30, 2006 at 08:56:24PM +0400, Bulat Ziganshin wrote: i think that your algorithm is too complex. standard algorithm, imho, is to find last space before 80 (or 75) chars margin, split here and then repeat this procedure again. so, one line split may look like splitAt . last . filter (80) . findIndices (==' ') ... Thank you very much for your analysis. I find it extremely helpful. btw, are you seen http://haskell.org/haskellwiki/Simple_unix_tools ? I did not! But I'm studying this page right now. Thanks for mentioning it. Once again, thank you! Andrea ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] cutting long strings into lines
On Sat, Sep 30, 2006 at 08:56:24PM +0400, Bulat Ziganshin wrote: splitByLen len_f [] = [] splitByLen len_f xs = y : splitByLen len_f ys where (y,ys) = splitAt (len_f xs) xs ... so, splitByLen len_f should give you that you need, you need only to add checks for some additional conditions (first word in line is more than 80 bytes long, it is a last line) and removing of the extra space on each line I came up with this solution that seem to be fine, to me. I does the checking of those additional conditions: findSplitP at = last . filter (at) . findIndices (==' ') where last [] = at last [x] = x last (_:xs) = last xs wrapLS at [] = [] wrapLS at s = take ln s ++ \n ++ rest where ln = findSplitP at s remain = drop ln s rest = if length remain at then wrapLS at (tail remain) else tail remain then you can use lines/unlines to split it. Thanks for your help. Best regards, Andrea ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] cutting long strings into lines
Matthias Fischmann wrote: although this wasn't the original problem, i like it, too :). but now i am stuck in finding an optimal implementation for lines. Isn't the obvious one good enough? lines [] = [] lines s = go s where go [] = [[]] go ('\n':s) = [] : lines s go (c:s) = let (l:ls) = go s in (c:l):ls Udo. -- Money can't buy friends, but it can get you a better class of enemy. -- Spike Milligan signature.asc Description: Digital signature ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] cutting long strings into lines
On Sat, Sep 30, 2006 at 08:51:40PM +0200, Udo Stenzel wrote: To: Matthias Fischmann [EMAIL PROTECTED] Cc: haskell-cafe@haskell.org From: Udo Stenzel [EMAIL PROTECTED] Date: Sat, 30 Sep 2006 20:51:40 +0200 Subject: Re: [Haskell-cafe] cutting long strings into lines Matthias Fischmann wrote: although this wasn't the original problem, i like it, too :). but now i am stuck in finding an optimal implementation for lines. Isn't the obvious one good enough? lines [] = [] lines s = go s where go [] = [[]] go ('\n':s) = [] : lines s go (c:s) = let (l:ls) = go s in (c:l):ls thanks. good enough, yes. just not obvious to me... (-: matthias signature.asc Description: Digital signature ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe