Re: [Haskell-cafe] simple parsec question
Carlo, Thanks a lot! This looks very promising (though I have to test it for my purpose more in depth). As you mention, the key seems to be the optionMaybe combinator. Thanks for pointing to it. Immanuel 2013/3/5 Carlo Hamalainen carlo.hamalai...@gmail.com On Mon, Mar 4, 2013 at 1:44 AM, Immanuel Normann immanuel.norm...@googlemail.com wrote: I am trying to parse a semi structured text with parsec that basically should identify sections. Each section starts with a headline and has an unstructured content - that's all. Here's my attempt: https://gist.github.com/carlohamalainen/5087207 {-# LANGUAGE FlexibleContexts #-} import Text.Parsec import Control.Applicative hiding ((|),many) -- Example input: {- top 1: some text ... bla top 2: more text ... bla bla -} data Top = Top String deriving (Show) data Content = Content [String] deriving (Show) data Section = Section Top Content deriving (Show) headline = do t - many1 (noneOf :\n) char ':' newline return $ Top t contentLine = do x - many (noneOf :\n) newline return x content = do line - optionMaybe (try contentLine) case line of Just x - do xs - content return (x:xs) _ - return [] section = do h - headline c - Content $ content return $ Section h c main = do x - readFile simple.txt print $ parse (many section) x Example run using your sample data: $ runhaskell Simple.hs Right [Section (Top top 1) (Content [,some text ... bla,]),Section (Top top 2) (Content [,more text ... bla bla,])] Notes: * I had to assume that a content line does not contain a ':', because that is the only way to distinguish a head-line (correct me if I'm wrong). * The key was to use optionMaybe along with try; see the definition of content. * I haven't tested this code on very large inputs. * I slightly changed the definition of Content to have a list of Strings, one for each line. I'm sure this could be altered if you wanted to retain all whitespace. * I am still new to Parsec, so don't take this as the definitive answer ;-) -- Carlo Hamalainen http://carlo-hamalainen.net ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] simple parsec question
Maybe this is something you do not even want to use a parser combinator library for. The package http://hackage.haskell.org/packages/archive/list-grouping/0.1.1/doc/html/Data-List-Grouping.html contains a function breakBefore, so you can write main = do inp - readFile ... let result = map mkSection . breakBefore ((= ':').last)). lines $ inp mkSection (l:ll) = Section (Top l) (Contents ll) Doaitse On Mar 3, 2013, at 16:44 , Immanuel Normann immanuel.norm...@googlemail.com wrote: Hi, I am trying to parse a semi structured text with parsec that basically should identify sections. Each section starts with a headline and has an unstructured content - that's all. For instance, consider the following example text (inside the dashed lines): --- top 1: some text ... bla top 2: more text ... bla bla --- This should be parsed into a structure like this: [Section (Top 1) (Content some text ... bla), Section (Top 1) (Content more text ... bla)] Say, I have a parser headline, but the content after a headline could be anything that is different from what headline parses. How could the section parser making use of headline look like? My idea would be to use the manyTill combinator, but I dont find an easy solution. Many thanks for any hint Immanuel ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] simple parsec question
Andrey, Thanks a lot for your effort! I have the same suspect that the lookahead in the content parser is the problem, but I don't know how to solve it either. At least the I learned from your code that noneOf is also a quite useful parser in this context which I have overlooked. Anyway, if you find a solution it would be great! In the end the task itself doesn't look very specific, but rather general: an alternation between strictly (the headline in my case) and loosely (the content in my case) structured text. It shouldn't be difficult to build a parser for such a setting. (btw. I am aware the my test parser would (or rather should) parse only the first section. For testing this would be sufficient.) 2013/3/4 Andrey Chudnov achud...@gmail.com Immanuel, I tried but I couldn't figure it out. Here's a gist with my attempts and results so far: https://gist.github.com/achudnov/f3af65f11d5162c73064There, 'test' uses my attempt at specifying the parser, 'test2' uses yours. Note that your attempt wouldn't parse multiple sections -- for that you need to use 'many section' instead of just 'section' in 'parse' ('parseFromFile' in the original). I think what's going on is the lookahead is wrong, but I'm not sure how exactly. I'll give it another go tomorrow if I have time. /Andrey On 03/03/2013 05:16 PM, Immanuel Normann wrote: Andrey, Thanks for your attempt, but it doesn't seem to work. The easy part is the headline, but the content makes trouble. Let me write the code a bit more explicit, so you can copy and paste it: -- {-# LANGUAGE FlexibleContexts #-} module Main where import Text.Parsec data Top = Top String deriving (Show) data Content = Content String deriving (Show) data Section = Section Top Content deriving (Show) headline :: Stream s m Char = ParsecT s u m Top headline = manyTill anyChar (char ':' newline) = return . Top content :: Stream s m Char = ParsecT s u m Content content = manyTill anyChar (try headline) = return . Content section :: Stream s m Char = ParsecT s u m Section section = do {h - headline; c - content; return (Section h c)} -- Assume the following example text is stored in /tmp/test.txt: --- top 1: some text ... bla top 2: more text ... bla bla --- Now I run the section parser in ghci against the above mentioned example text stored in /tmp/test.txt: *Main parseFromFile section /tmp/test.txt Right (Section (Top top 1) (Content )) I don't understand the behaviour of the content parser here. Why does it return ? Or perhaps more generally, I don't understand the manyTill combinator (though I read the docs). Side remark: of cause for this little task it is probably to much effort to use parsec. However, my content in fact has an internal structure which I would like to parse further, but I deliberately abstracted from these internals as they don't effect my above stated problem. Immanuel 2013/3/3 Andrey Chudnov achud...@gmail.com Immanuel, Since a heading always starts with a new line (and ends with a colon followed by a carriage return or just a colon?), I think it might be useful to first separate the input into lines and then classify them depending on whether it's a heading or not and reassemble them into the value you need. You don't even need parsec for that. However, if you really want to use parsec, you can write something like (warning, not tested): many $ liftM2 Section headline content where headline = anyChar `manyTill` (char ':' spaces newline) content = anyChar `manyTill` (try $ newline headline) /Andrey On 3/3/2013 10:44 AM, Immanuel Normann wrote: I am trying to parse a semi structured text with parsec that basically should identify sections. Each section starts with a headline and has an unstructured content - that's all. For instance, consider the following example text (inside the dashed lines): --- top 1: some text ... bla top 2: more text ... bla bla --- This should be parsed into a structure like this: [Section (Top 1) (Content some text ... bla), Section (Top 1) (Content more text ... bla)] Say, I have a parser headline, but the content after a headline could be anything that is different from what headline parses. How could the section parser making use of headline look like? My idea would be to use the manyTill combinator, but I dont find an easy solution. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] simple parsec question
On Mon, Mar 4, 2013 at 1:44 AM, Immanuel Normann immanuel.norm...@googlemail.com wrote: I am trying to parse a semi structured text with parsec that basically should identify sections. Each section starts with a headline and has an unstructured content - that's all. Here's my attempt: https://gist.github.com/carlohamalainen/5087207 {-# LANGUAGE FlexibleContexts #-} import Text.Parsec import Control.Applicative hiding ((|),many) -- Example input: {- top 1: some text ... bla top 2: more text ... bla bla -} data Top = Top String deriving (Show) data Content = Content [String] deriving (Show) data Section = Section Top Content deriving (Show) headline = do t - many1 (noneOf :\n) char ':' newline return $ Top t contentLine = do x - many (noneOf :\n) newline return x content = do line - optionMaybe (try contentLine) case line of Just x - do xs - content return (x:xs) _ - return [] section = do h - headline c - Content $ content return $ Section h c main = do x - readFile simple.txt print $ parse (many section) x Example run using your sample data: $ runhaskell Simple.hs Right [Section (Top top 1) (Content [,some text ... bla,]),Section (Top top 2) (Content [,more text ... bla bla,])] Notes: * I had to assume that a content line does not contain a ':', because that is the only way to distinguish a head-line (correct me if I'm wrong). * The key was to use optionMaybe along with try; see the definition of content. * I haven't tested this code on very large inputs. * I slightly changed the definition of Content to have a list of Strings, one for each line. I'm sure this could be altered if you wanted to retain all whitespace. * I am still new to Parsec, so don't take this as the definitive answer ;-) -- Carlo Hamalainen http://carlo-hamalainen.net ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] simple parsec question
Hi, I am trying to parse a semi structured text with parsec that basically should identify sections. Each section starts with a headline and has an unstructured content - that's all. For instance, consider the following example text (inside the dashed lines): --- top 1: some text ... bla top 2: more text ... bla bla --- This should be parsed into a structure like this: [Section (Top 1) (Content some text ... bla), Section (Top 1) (Content more text ... bla)] Say, I have a parser headline, but the content after a headline could be anything that is different from what headline parses. How could the section parser making use of headline look like? My idea would be to use the manyTill combinator, but I dont find an easy solution. Many thanks for any hint Immanuel ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] simple parsec question
Immanuel, Since a heading always starts with a new line (and ends with a colon followed by a carriage return or just a colon?), I think it might be useful to first separate the input into lines and then classify them depending on whether it's a heading or not and reassemble them into the value you need. You don't even need parsec for that. However, if you really want to use parsec, you can write something like (warning, not tested): many $ liftM2 Section headline content where headline = anyChar `manyTill` (char ':' spaces newline) content = anyChar `manyTill` (try $ newline headline) /Andrey On 3/3/2013 10:44 AM, Immanuel Normann wrote: I am trying to parse a semi structured text with parsec that basically should identify sections. Each section starts with a headline and has an unstructured content - that's all. For instance, consider the following example text (inside the dashed lines): --- top 1: some text ... bla top 2: more text ... bla bla --- This should be parsed into a structure like this: [Section (Top 1) (Content some text ... bla), Section (Top 1) (Content more text ... bla)] Say, I have a parser headline, but the content after a headline could be anything that is different from what headline parses. How could the section parser making use of headline look like? My idea would be to use the manyTill combinator, but I dont find an easy solution. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] simple parsec question
Andrey, Thanks for your attempt, but it doesn't seem to work. The easy part is the headline, but the content makes trouble. Let me write the code a bit more explicit, so you can copy and paste it: -- {-# LANGUAGE FlexibleContexts #-} module Main where import Text.Parsec data Top = Top String deriving (Show) data Content = Content String deriving (Show) data Section = Section Top Content deriving (Show) headline :: Stream s m Char = ParsecT s u m Top headline = manyTill anyChar (char ':' newline) = return . Top content :: Stream s m Char = ParsecT s u m Content content = manyTill anyChar (try headline) = return . Content section :: Stream s m Char = ParsecT s u m Section section = do {h - headline; c - content; return (Section h c)} -- Assume the following example text is stored in /tmp/test.txt: --- top 1: some text ... bla top 2: more text ... bla bla --- Now I run the section parser in ghci against the above mentioned example text stored in /tmp/test.txt: *Main parseFromFile section /tmp/test.txt Right (Section (Top top 1) (Content )) I don't understand the behaviour of the content parser here. Why does it return ? Or perhaps more generally, I don't understand the manyTill combinator (though I read the docs). Side remark: of cause for this little task it is probably to much effort to use parsec. However, my content in fact has an internal structure which I would like to parse further, but I deliberately abstracted from these internals as they don't effect my above stated problem. Immanuel 2013/3/3 Andrey Chudnov achud...@gmail.com Immanuel, Since a heading always starts with a new line (and ends with a colon followed by a carriage return or just a colon?), I think it might be useful to first separate the input into lines and then classify them depending on whether it's a heading or not and reassemble them into the value you need. You don't even need parsec for that. However, if you really want to use parsec, you can write something like (warning, not tested): many $ liftM2 Section headline content where headline = anyChar `manyTill` (char ':' spaces newline) content = anyChar `manyTill` (try $ newline headline) /Andrey On 3/3/2013 10:44 AM, Immanuel Normann wrote: I am trying to parse a semi structured text with parsec that basically should identify sections. Each section starts with a headline and has an unstructured content - that's all. For instance, consider the following example text (inside the dashed lines): --- top 1: some text ... bla top 2: more text ... bla bla --- This should be parsed into a structure like this: [Section (Top 1) (Content some text ... bla), Section (Top 1) (Content more text ... bla)] Say, I have a parser headline, but the content after a headline could be anything that is different from what headline parses. How could the section parser making use of headline look like? My idea would be to use the manyTill combinator, but I dont find an easy solution. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] simple parsec question
Immanuel, I tried but I couldn't figure it out. Here's a gist with my attempts and results so far: https://gist.github.com/achudnov/f3af65f11d5162c73064 There, 'test' uses my attempt at specifying the parser, 'test2' uses yours. Note that your attempt wouldn't parse multiple sections -- for that you need to use 'many section' instead of just 'section' in 'parse' ('parseFromFile' in the original). I think what's going on is the lookahead is wrong, but I'm not sure how exactly. I'll give it another go tomorrow if I have time. /Andrey On 03/03/2013 05:16 PM, Immanuel Normann wrote: Andrey, Thanks for your attempt, but it doesn't seem to work. The easy part is the headline, but the content makes trouble. Let me write the code a bit more explicit, so you can copy and paste it: -- {-# LANGUAGE FlexibleContexts #-} module Main where import Text.Parsec data Top = Top String deriving (Show) data Content = Content String deriving (Show) data Section = Section Top Content deriving (Show) headline :: Stream s m Char = ParsecT s u m Top headline = manyTill anyChar (char ':' newline) = return . Top content :: Stream s m Char = ParsecT s u m Content content = manyTill anyChar (try headline) = return . Content section :: Stream s m Char = ParsecT s u m Section section = do {h - headline; c - content; return (Section h c)} -- Assume the following example text is stored in /tmp/test.txt: --- top 1: some text ... bla top 2: more text ... bla bla --- Now I run the section parser in ghci against the above mentioned example text stored in /tmp/test.txt: *Main parseFromFile section /tmp/test.txt Right (Section (Top top 1) (Content )) I don't understand the behaviour of the content parser here. Why does it return ? Or perhaps more generally, I don't understand the manyTill combinator (though I read the docs). Side remark: of cause for this little task it is probably to much effort to use parsec. However, my content in fact has an internal structure which I would like to parse further, but I deliberately abstracted from these internals as they don't effect my above stated problem. Immanuel 2013/3/3 Andrey Chudnov achud...@gmail.com mailto:achud...@gmail.com Immanuel, Since a heading always starts with a new line (and ends with a colon followed by a carriage return or just a colon?), I think it might be useful to first separate the input into lines and then classify them depending on whether it's a heading or not and reassemble them into the value you need. You don't even need parsec for that. However, if you really want to use parsec, you can write something like (warning, not tested): many $ liftM2 Section headline content where headline = anyChar `manyTill` (char ':' spaces newline) content = anyChar `manyTill` (try $ newline headline) /Andrey On 3/3/2013 10:44 AM, Immanuel Normann wrote: I am trying to parse a semi structured text with parsec that basically should identify sections. Each section starts with a headline and has an unstructured content - that's all. For instance, consider the following example text (inside the dashed lines): --- top 1: some text ... bla top 2: more text ... bla bla --- This should be parsed into a structure like this: [Section (Top 1) (Content some text ... bla), Section (Top 1) (Content more text ... bla)] Say, I have a parser headline, but the content after a headline could be anything that is different from what headline parses. How could the section parser making use of headline look like? My idea would be to use the manyTill combinator, but I dont find an easy solution. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] A parsec question
Does this one give the expected error message for Parsec3.1 - unfortunately I can't test as I'm still using Parsec 2.1.0.1. parser = block (many digit ? digit) ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Parsec question (new user): unexpected end of input
Am 29.09.2010 05:35, schrieb Peter Schmitz: [...] Error parsing file: ...\sampleTaggedContent.txt (line 4, column 1): unexpected end of input expecting The input was: [...] -- Parsers: taggedContent = do optionalWhiteSpace aTag many tagOrContent aTag many tagOrContent will consume all tags, so that no tag for the following aTag will be left. Cheers Christian eof return Parse complete. tagOrContent = aTag | someContent ? tagOrContent aTag = do tagBegin xs - many (noneOf [tagEndChar]) tagEnd optionalWhiteSpace return () someContent = do manyTill anyChar tagBegin return () optionalWhiteSpace = spaces -- i.e., any of \v\f\t\r\n tagBegin = char tagBeginChar tagEnd = char tagEndChar -- Etc: tagBeginChar = '' tagEndChar = '' ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Parsec question (new user): unexpected end of input
Am 29.09.2010 09:54, schrieb Christian Maeder: Am 29.09.2010 05:35, schrieb Peter Schmitz: [...] Error parsing file: ...\sampleTaggedContent.txt (line 4, column 1): unexpected end of input expecting The input was: [...] -- Parsers: taggedContent = do optionalWhiteSpace aTag many tagOrContent aTag many tagOrContent will consume all tags, so that no tag for the following aTag will be left. if you want to match a final tag, you could try: manyTill tagOrContent (try (aTag eof)) Cheers Christian eof return Parse complete. tagOrContent = aTag | someContent ? tagOrContent aTag = do tagBegin xs - many (noneOf [tagEndChar]) this also looks like manyTill anyChar tagEnd C. tagEnd optionalWhiteSpace return () someContent = do manyTill anyChar tagBegin return () optionalWhiteSpace = spaces -- i.e., any of \v\f\t\r\n tagBegin = char tagBeginChar tagEnd = char tagEndChar -- Etc: tagBeginChar = '' tagEndChar = '' ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Parsec question (new user): unexpected end of input
Am 29.09.2010 11:55, schrieb Christian Maeder: Am 29.09.2010 09:54, schrieb Christian Maeder: Am 29.09.2010 05:35, schrieb Peter Schmitz: [...] Error parsing file: ...\sampleTaggedContent.txt (line 4, column 1): unexpected end of input expecting The input was: [...] -- Parsers: taggedContent = do optionalWhiteSpace aTag many tagOrContent aTag many tagOrContent will consume all tags, so that no tag for the following aTag will be left. if you want to match a final tag, you could try: manyTill tagOrContent (try (aTag eof)) better yet, avoiding backtracking, return different things for aTag and someContents and check if the last entry is a tag. tagOrContent = fmap Left aTag | fmap Right someContent taggedContent = do spaces aTag l - many tagOrContent eof case reverse l of Left _ : _ - return () _ - fail expected final tag before EOF C. Cheers Christian eof return Parse complete. tagOrContent = aTag | someContent ? tagOrContent aTag = do tagBegin xs - many (noneOf [tagEndChar]) this also looks like manyTill anyChar tagEnd C. tagEnd optionalWhiteSpace return () someContent = do manyTill anyChar tagBegin return () optionalWhiteSpace = spaces -- i.e., any of \v\f\t\r\n tagBegin = char tagBeginChar tagEnd = char tagEndChar -- Etc: tagBeginChar = '' tagEndChar = '' ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] A parsec question
I have a question about Parsec. The following program import Control.Applicative ((*),(*)) import Text.Parsec import Text.Parsec.Char block p = char '{' * p * char '}' parser = block (many digit) main = parseTest parser {123a} gives the output parse error at (line 1, column 5): unexpected a expecting } Note the last line mentions only '}'. I would rather like to see expecting } or digit since the parser could very well accept another digit here. (1) What is the reason for this behaviour? (2) Is there another combinator that behaves as I would like? (3) Otherwise, how do I write one myself? BTW, I am using parsec-3.1.0 and ghc-6.12.3. Cheers Ben ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Parsec question (new user): unexpected end of input
Antoine and Christian: Many thanks for your help on this thread. (I am still digesting it; much appreciated; will post when I get it working.) -- Peter ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Parsec question
Michael Mossey wrote: Here's what I have so far. It works, but it's a bit weird to consume the // as part of the text rather than the keyword. That happens because the try( string // ), which is part of the end arg to manyTill, consumes the // when it succeeds. But maybe it is the most natural way to express the problem. use lookAhead! parseKeyword :: Parser String parseKeyword = many1 (alphaNum | char '_') parseKeyword = string // many1 (alphaNum | char '_') parseText :: Parser String parseText = manyTill anyChar ((try (string //) return ()) | eof) parseText = manyTill anyChar $ (lookAhead (try $ string //) return ()) | eof (untested) C. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Parsec question
Erik de Castro Lopo schrieb: Erik de Castro Lopo wrote: qualifiedIdentifier :: CharParser st [ String ] Ahh, figured it out myself: qualifiedIdentifier :: CharParser st [ String ] qualifiedIdentifier = do i - identifier r - dotIdentifier return (i : r) where dotIdentifier = do char '.' i - identifier r - dotIdentifier return (i : r) | return [] Does that look sane to people who know Haskell and Parsec better than me? Hi Erik, have a look at the module Text.ParserCombinators.Parsec.Combinator. Those functions should help you to build up parsers from smaller building blocks. Using sepBy1, the above parser can be written as dot = T.dot lexer qualifiedIdentifier = sepBy1 identifier dot benedikt ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Parsec question
Benedikt Huber wrote: have a look at the module Text.ParserCombinators.Parsec.Combinator. Those functions should help you to build up parsers from smaller building blocks. Using sepBy1, the above parser can be written as dot = T.dot lexer qualifiedIdentifier = sepBy1 identifier dot WOW That is really impressive! Thanks, Erik -- - Erik de Castro Lopo - One serious obstacle to the adoption of good programming languages is the notion that everything has to be sacrificed for speed. In computer languages as in life, speed kills. -- Mike Vanier ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Parsec question
Benedikt Huber wrote: Using sepBy1, the above parser can be written as dot = T.dot lexer qualifiedIdentifier = sepBy1 identifier dot My next problem is matching things like: identifier ('.' identifier)* ('.' '*')? I've had a look at lookAhead from Text.ParserCombinators.Parsec.Combinator but I can't get it to work. Clues? Erik -- - Erik de Castro Lopo - That being done, all you have to do next is call free() slightly less often than malloc(). You may want to examine the Solaris system libraries for a particularly ambitious implementation of this technique. -- Eric O'Dell (comp.lang.dylan) ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Parsec question
On Wed, Dec 24, 2008 at 9:22 AM, Erik de Castro Lopo mle...@mega-nerd.com wrote: My next problem is matching things like: identifier ('.' identifier)* ('.' '*')? I've had a look at lookAhead from Text.ParserCombinators.Parsec.Combinator but I can't get it to work. * is analogous to the 'many' combinator, and ? can be implemented with the 'option' combinator. Parsec is all about composing bigger parsers out of smaller ones using combinators like these. One of the tricks I found early on is to understand where to use 'try' (since by default input is consumed even if a parser fails) but apart from that just read Daan's page, even though it's out of date, and look at how all these cool combinators work. http://legacy.cs.uu.nl/daan/download/parsec/parsec.html ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Parsec question
Toby Hutton wrote: One of the tricks I found early on is to understand where to use 'try' (since by default input is consumed even if a parser fails) but apart from that just read Daan's page, even though it's out of date, and look at how all these cool combinators work. http://legacy.cs.uu.nl/daan/download/parsec/parsec.html Ah yes, reading that document and using 'try' is a good tip. This is what I cam up with: qualifiedIdentStar :: CharParser st [ String ] qualifiedIdentStar = do try identDotStar | qualifiedIdentifier where identDotStar = do s - sepEndBy1 identifier dot char '*' return (s ++ [ * ]) Cheers, Erik -- - Erik de Castro Lopo - It has been discovered that C++ provides a remarkable facility for concealing the trival details of a program -- such as where its bugs are. -- David Keppel ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Parsec Question
Hi Gerd, despite SourcePos being abstract, it can be fully manipulated using newPos. import Text.ParserCombinators.Parsec.Pos If you can compute the positions from your Tok-stream then you may consider using tokenPrim and work with GenParser Tok () a HTH Christian Gerd M wrote: I'm trying to use parsec for parsing a custom input stream. As far as I understood the manual correctly I need to define the primitive parser: type MyParser a = GenParser (SourcePos,Tok) () a mytoken :: (Tok - Maybe a) - MyParser a mytoken test = token showToken posToken testToken where showToken (pos,tok) = show tok posToken (pos,tok) = pos testToken (pos,tok) = test tok The problem is, since SourcePos is an abstract datatype, how can I actually run this parser without explicitly using values of type SourcePos in the input stream? ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Parsec Question
despite SourcePos being abstract, it can be fully manipulated using newPos. Thanks for the tip, I thought it wasn't exported. Gerd M wrote: I'm trying to use parsec for parsing a custom input stream. As far as I understood the manual correctly I need to define the primitive parser: type MyParser a = GenParser (SourcePos,Tok) () a mytoken :: (Tok - Maybe a) - MyParser a mytoken test = token showToken posToken testToken where showToken (pos,tok) = show tok posToken (pos,tok) = pos testToken (pos,tok) = test tok The problem is, since SourcePos is an abstract datatype, how can I actually run this parser without explicitly using values of type SourcePos in the input stream? _ Express yourself instantly with MSN Messenger! Download today it's FREE! http://messenger.msn.click-url.com/go/onm00200471ave/direct/01/ ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe