Re: [Haskell-cafe] simple parsec question

2013-03-05 Thread Immanuel Normann
Carlo,

Thanks a lot! This looks very promising (though I have to test it for my
purpose more in depth). As you mention, the key seems to be the optionMaybe
combinator. Thanks for pointing to it.

Immanuel


2013/3/5 Carlo Hamalainen carlo.hamalai...@gmail.com

 On Mon, Mar 4, 2013 at 1:44 AM, Immanuel Normann 
 immanuel.norm...@googlemail.com wrote:

 I am trying to parse a semi structured text with parsec that basically
 should identify sections. Each section starts with a headline and has an
 unstructured content - that's all.


 Here's my attempt: https://gist.github.com/carlohamalainen/5087207

 {-# LANGUAGE FlexibleContexts #-}

 import Text.Parsec
 import Control.Applicative hiding ((|),many)

 -- Example input:

 {-
 top 1:

 some text ... bla

 top 2:

 more text ... bla bla

 -}

 data Top = Top String deriving (Show)
 data Content = Content [String] deriving (Show)
 data Section = Section Top Content deriving (Show)

 headline = do
 t - many1 (noneOf :\n)
 char ':'
 newline

 return $ Top t

 contentLine = do
 x - many (noneOf :\n)
 newline
 return x

 content = do
 line - optionMaybe (try contentLine)

 case line of Just x - do xs - content
   return (x:xs)
  _  - return []

 section = do
 h - headline
 c - Content $ content
 return $ Section h c

 main = do
 x - readFile simple.txt
 print $ parse (many section)  x


 Example run using your sample data:

 $ runhaskell Simple.hs
 Right [Section (Top top 1) (Content [,some text ... bla,]),Section
 (Top top 2) (Content [,more text ... bla bla,])]

 Notes:

 * I had to assume that a content line does not contain a ':', because that
 is the only way to distinguish a head-line (correct me if I'm wrong).

 * The key was to use optionMaybe along with try; see the definition of
 content.

 * I haven't tested this code on very large inputs.

 * I slightly changed the definition of Content to have a list of Strings,
 one for each line. I'm sure this could be altered if you wanted to retain
 all whitespace.

 * I am still new to Parsec, so don't take this as the definitive answer ;-)

 --
 Carlo Hamalainen
 http://carlo-hamalainen.net
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] simple parsec question

2013-03-05 Thread S. Doaitse Swierstra
Maybe this is something you do not even want to use a parser combinator library 
for. The package

http://hackage.haskell.org/packages/archive/list-grouping/0.1.1/doc/html/Data-List-Grouping.html

contains a function breakBefore, so you can write

main = do inp - readFile ...
 let result = map mkSection . breakBefore ((= ':').last)). 
lines $ inp

mkSection (l:ll) = Section (Top l) (Contents ll)

Doaitse


On Mar 3, 2013, at 16:44 , Immanuel Normann immanuel.norm...@googlemail.com 
wrote:

 Hi,
 
 I am trying to parse a semi structured text with parsec that basically should 
 identify sections. Each section starts with a headline and has an 
 unstructured content - that's all. For instance, consider the following 
 example text (inside the dashed lines):
 
 ---
 
 top 1:
 
 some text ... bla
 
 top 2:
 
 more text ... bla bla
 
 
 ---
 
 This should be parsed into a structure like this:
 
 [Section (Top 1) (Content some text ... bla), Section (Top 1) (Content 
 more text ... bla)]
 
 Say, I have a parser headline, but the content after a headline could be 
 anything that is different from what headline parses.
 How could the section parser making use of headline look like?
 My idea would be to use the manyTill combinator, but I dont find an easy 
 solution.
 
 Many thanks for any hint
 
 Immanuel
 ___
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] simple parsec question

2013-03-04 Thread Immanuel Normann
Andrey,
Thanks a lot for your effort! I have the same suspect that the lookahead in
the content parser is the problem, but I don't know how to solve it either.
At least the I learned from your code that noneOf is also a quite useful
parser in this context which I have overlooked.
Anyway, if you find a solution it would be great! In the end the task
itself doesn't look very specific, but rather general: an alternation
between strictly (the headline in my case) and loosely (the content in my
case) structured text. It shouldn't be difficult to build a parser for such
a setting.

(btw. I am aware the my test parser would (or rather should) parse only the
first section. For testing this would be sufficient.)



2013/3/4 Andrey Chudnov achud...@gmail.com

  Immanuel,
 I tried but I couldn't figure it out. Here's a gist with my attempts and
 results so far: https://gist.github.com/achudnov/f3af65f11d5162c73064There, 
 'test' uses my attempt at specifying the parser, 'test2' uses yours.
 Note that your attempt wouldn't parse multiple sections -- for that you
 need to use 'many section' instead of just 'section' in 'parse'
 ('parseFromFile' in the original).
 I think what's going on is the lookahead is wrong, but I'm not sure how
 exactly. I'll give it another go tomorrow if I have time.

 /Andrey


 On 03/03/2013 05:16 PM, Immanuel Normann wrote:

Andrey,

  Thanks for your attempt, but it doesn't seem to work. The easy part is
 the headline, but the content makes trouble.

 Let me write the code a bit more explicit, so you can copy and paste it:

 --
 {-# LANGUAGE FlexibleContexts #-}

 module Main where

 import Text.Parsec

 data Top = Top String deriving (Show)
 data Content = Content String deriving (Show)
 data Section = Section Top Content deriving (Show)

 headline :: Stream s m Char = ParsecT s u m Top
 headline = manyTill anyChar (char ':'  newline) = return . Top

 content :: Stream s m Char = ParsecT s u m Content
 content = manyTill anyChar (try headline) = return . Content

 section :: Stream s m Char = ParsecT s u m Section
 section = do {h - headline; c - content; return (Section h c)}
 --


  Assume the following example text is stored in  /tmp/test.txt:
 ---
 top 1:

 some text ... bla

 top 2:

 more text ... bla bla
 ---

  Now I run the section parser in ghci against the above mentioned example
 text stored in /tmp/test.txt:

 *Main parseFromFile section /tmp/test.txt
 Right (Section (Top top 1) (Content ))

  I don't understand the behaviour of the content parser here. Why does it
 return ? Or perhaps more generally, I don't understand the manyTill
 combinator (though I read the docs).

  Side remark: of cause for this little task it is probably to much effort
 to use parsec. However, my content in fact has an internal structure which
 I would like to parse further, but I deliberately abstracted from these
 internals as they don't effect my above stated problem.

  Immanuel


 2013/3/3 Andrey Chudnov achud...@gmail.com

 Immanuel,
 Since a heading always starts with a new line (and ends with a colon
 followed by a carriage return or just a colon?), I think it might be useful
 to first separate the input into lines and then classify them depending on
 whether it's a heading or not and reassemble them into the value you need.
 You don't even need parsec for that.

 However, if you really want to use parsec, you can write something like
 (warning, not tested):
 many $ liftM2 Section headline content
where headline = anyChar `manyTill` (char ':'  spaces  newline)
content  = anyChar `manyTill` (try $ newline  headline)

 /Andrey


 On 3/3/2013 10:44 AM, Immanuel Normann wrote:

 I am trying to parse a semi structured text with parsec that basically
 should identify sections. Each section starts with a headline and has an
 unstructured content - that's all. For instance, consider the following
 example text (inside the dashed lines):

 ---

 top 1:

 some text ... bla

 top 2:

 more text ... bla bla


 ---

 This should be parsed into a structure like this:

 [Section (Top 1) (Content some text ... bla), Section (Top 1) (Content
 more text ... bla)]

 Say, I have a parser headline, but the content after a headline could
 be anything that is different from what headline parses.
 How could the section parser making use of headline look like?
 My idea would be to use the manyTill combinator, but I dont find an
 easy solution.




___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] simple parsec question

2013-03-04 Thread Carlo Hamalainen
On Mon, Mar 4, 2013 at 1:44 AM, Immanuel Normann 
immanuel.norm...@googlemail.com wrote:

 I am trying to parse a semi structured text with parsec that basically
 should identify sections. Each section starts with a headline and has an
 unstructured content - that's all.


Here's my attempt: https://gist.github.com/carlohamalainen/5087207

{-# LANGUAGE FlexibleContexts #-}

import Text.Parsec
import Control.Applicative hiding ((|),many)

-- Example input:

{-
top 1:

some text ... bla

top 2:

more text ... bla bla

-}

data Top = Top String deriving (Show)
data Content = Content [String] deriving (Show)
data Section = Section Top Content deriving (Show)

headline = do
t - many1 (noneOf :\n)
char ':'
newline

return $ Top t

contentLine = do
x - many (noneOf :\n)
newline
return x

content = do
line - optionMaybe (try contentLine)

case line of Just x - do xs - content
  return (x:xs)
 _  - return []

section = do
h - headline
c - Content $ content
return $ Section h c

main = do
x - readFile simple.txt
print $ parse (many section)  x


Example run using your sample data:

$ runhaskell Simple.hs
Right [Section (Top top 1) (Content [,some text ... bla,]),Section
(Top top 2) (Content [,more text ... bla bla,])]

Notes:

* I had to assume that a content line does not contain a ':', because that
is the only way to distinguish a head-line (correct me if I'm wrong).

* The key was to use optionMaybe along with try; see the definition of
content.

* I haven't tested this code on very large inputs.

* I slightly changed the definition of Content to have a list of Strings,
one for each line. I'm sure this could be altered if you wanted to retain
all whitespace.

* I am still new to Parsec, so don't take this as the definitive answer ;-)

-- 
Carlo Hamalainen
http://carlo-hamalainen.net
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] simple parsec question

2013-03-03 Thread Immanuel Normann
Hi,

I am trying to parse a semi structured text with parsec that basically
should identify sections. Each section starts with a headline and has an
unstructured content - that's all. For instance, consider the following
example text (inside the dashed lines):

---

top 1:

some text ... bla

top 2:

more text ... bla bla


---

This should be parsed into a structure like this:

[Section (Top 1) (Content some text ... bla), Section (Top 1) (Content
more text ... bla)]

Say, I have a parser headline, but the content after a headline could be
anything that is different from what headline parses.
How could the section parser making use of headline look like?
My idea would be to use the manyTill combinator, but I dont find an easy
solution.

Many thanks for any hint

Immanuel
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] simple parsec question

2013-03-03 Thread Andrey Chudnov

Immanuel,
Since a heading always starts with a new line (and ends with a colon 
followed by a carriage return or just a colon?), I think it might be 
useful to first separate the input into lines and then classify them 
depending on whether it's a heading or not and reassemble them into the 
value you need. You don't even need parsec for that.


However, if you really want to use parsec, you can write something like 
(warning, not tested):

many $ liftM2 Section headline content
   where headline = anyChar `manyTill` (char ':'  spaces  newline)
   content  = anyChar `manyTill` (try $ newline  headline)

/Andrey

On 3/3/2013 10:44 AM, Immanuel Normann wrote:
I am trying to parse a semi structured text with parsec that basically 
should identify sections. Each section starts with a headline and has 
an unstructured content - that's all. For instance, consider the 
following example text (inside the dashed lines):


---

top 1:

some text ... bla

top 2:

more text ... bla bla


---

This should be parsed into a structure like this:

[Section (Top 1) (Content some text ... bla), Section (Top 1) 
(Content more text ... bla)]


Say, I have a parser headline, but the content after a headline 
could be anything that is different from what headline parses.

How could the section parser making use of headline look like?
My idea would be to use the manyTill combinator, but I dont find an 
easy solution.


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] simple parsec question

2013-03-03 Thread Immanuel Normann
Andrey,

Thanks for your attempt, but it doesn't seem to work. The easy part is the
headline, but the content makes trouble.

Let me write the code a bit more explicit, so you can copy and paste it:

--
{-# LANGUAGE FlexibleContexts #-}

module Main where

import Text.Parsec

data Top = Top String deriving (Show)
data Content = Content String deriving (Show)
data Section = Section Top Content deriving (Show)

headline :: Stream s m Char = ParsecT s u m Top
headline = manyTill anyChar (char ':'  newline) = return . Top

content :: Stream s m Char = ParsecT s u m Content
content = manyTill anyChar (try headline) = return . Content

section :: Stream s m Char = ParsecT s u m Section
section = do {h - headline; c - content; return (Section h c)}
--


Assume the following example text is stored in  /tmp/test.txt:
---
top 1:

some text ... bla

top 2:

more text ... bla bla
---

Now I run the section parser in ghci against the above mentioned example
text stored in /tmp/test.txt:

*Main parseFromFile section /tmp/test.txt
Right (Section (Top top 1) (Content ))

I don't understand the behaviour of the content parser here. Why does it
return ? Or perhaps more generally, I don't understand the manyTill
combinator (though I read the docs).

Side remark: of cause for this little task it is probably to much effort to
use parsec. However, my content in fact has an internal structure which I
would like to parse further, but I deliberately abstracted from these
internals as they don't effect my above stated problem.

Immanuel


2013/3/3 Andrey Chudnov achud...@gmail.com

 Immanuel,
 Since a heading always starts with a new line (and ends with a colon
 followed by a carriage return or just a colon?), I think it might be useful
 to first separate the input into lines and then classify them depending on
 whether it's a heading or not and reassemble them into the value you need.
 You don't even need parsec for that.

 However, if you really want to use parsec, you can write something like
 (warning, not tested):
 many $ liftM2 Section headline content
where headline = anyChar `manyTill` (char ':'  spaces  newline)
content  = anyChar `manyTill` (try $ newline  headline)

 /Andrey


 On 3/3/2013 10:44 AM, Immanuel Normann wrote:

 I am trying to parse a semi structured text with parsec that basically
 should identify sections. Each section starts with a headline and has an
 unstructured content - that's all. For instance, consider the following
 example text (inside the dashed lines):

 ---

 top 1:

 some text ... bla

 top 2:

 more text ... bla bla


 ---

 This should be parsed into a structure like this:

 [Section (Top 1) (Content some text ... bla), Section (Top 1) (Content
 more text ... bla)]

 Say, I have a parser headline, but the content after a headline could
 be anything that is different from what headline parses.
 How could the section parser making use of headline look like?
 My idea would be to use the manyTill combinator, but I dont find an
 easy solution.


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] simple parsec question

2013-03-03 Thread Andrey Chudnov
Immanuel,
I tried but I couldn't figure it out. Here's a gist with my attempts and
results so far: https://gist.github.com/achudnov/f3af65f11d5162c73064
There, 'test' uses my attempt at specifying the parser, 'test2' uses
yours. Note that your attempt wouldn't parse multiple sections -- for
that you need to use 'many section' instead of just 'section' in 'parse'
('parseFromFile' in the original).
I think what's going on is the lookahead is wrong, but I'm not sure how
exactly. I'll give it another go tomorrow if I have time.

/Andrey

On 03/03/2013 05:16 PM, Immanuel Normann wrote:
 Andrey,

 Thanks for your attempt, but it doesn't seem to work. The easy part is
 the headline, but the content makes trouble.

 Let me write the code a bit more explicit, so you can copy and paste it:

 --
 {-# LANGUAGE FlexibleContexts #-}

 module Main where

 import Text.Parsec

 data Top = Top String deriving (Show)
 data Content = Content String deriving (Show)
 data Section = Section Top Content deriving (Show)

 headline :: Stream s m Char = ParsecT s u m Top
 headline = manyTill anyChar (char ':'  newline) = return . Top

 content :: Stream s m Char = ParsecT s u m Content
 content = manyTill anyChar (try headline) = return . Content

 section :: Stream s m Char = ParsecT s u m Section
 section = do {h - headline; c - content; return (Section h c)}
 --


 Assume the following example text is stored in  /tmp/test.txt:
 ---
 top 1:

 some text ... bla

 top 2:

 more text ... bla bla
 ---

 Now I run the section parser in ghci against the above mentioned
 example text stored in /tmp/test.txt:

 *Main parseFromFile section /tmp/test.txt
 Right (Section (Top top 1) (Content ))

 I don't understand the behaviour of the content parser here. Why does
 it return ? Or perhaps more generally, I don't understand the
 manyTill combinator (though I read the docs).

 Side remark: of cause for this little task it is probably to much
 effort to use parsec. However, my content in fact has an internal
 structure which I would like to parse further, but I deliberately
 abstracted from these internals as they don't effect my above stated
 problem.

 Immanuel


 2013/3/3 Andrey Chudnov achud...@gmail.com mailto:achud...@gmail.com

 Immanuel,
 Since a heading always starts with a new line (and ends with a
 colon followed by a carriage return or just a colon?), I think it
 might be useful to first separate the input into lines and then
 classify them depending on whether it's a heading or not and
 reassemble them into the value you need. You don't even need
 parsec for that.

 However, if you really want to use parsec, you can write something
 like (warning, not tested):
 many $ liftM2 Section headline content
where headline = anyChar `manyTill` (char ':'  spaces  newline)
content  = anyChar `manyTill` (try $ newline 
 headline)

 /Andrey


 On 3/3/2013 10:44 AM, Immanuel Normann wrote:

 I am trying to parse a semi structured text with parsec that
 basically should identify sections. Each section starts with a
 headline and has an unstructured content - that's all. For
 instance, consider the following example text (inside the
 dashed lines):

 ---

 top 1:

 some text ... bla

 top 2:

 more text ... bla bla


 ---

 This should be parsed into a structure like this:

 [Section (Top 1) (Content some text ... bla), Section (Top
 1) (Content more text ... bla)]

 Say, I have a parser headline, but the content after a
 headline could be anything that is different from what
 headline parses.
 How could the section parser making use of headline look like?
 My idea would be to use the manyTill combinator, but I dont
 find an easy solution.



___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] A parsec question

2010-10-03 Thread Stephen Tetley
Does this one give the expected error message for Parsec3.1 -
unfortunately I can't test as I'm still using Parsec 2.1.0.1.

 parser = block (many digit ? digit)
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Parsec question (new user): unexpected end of input

2010-09-29 Thread Christian Maeder
Am 29.09.2010 05:35, schrieb Peter Schmitz:
[...]
 Error parsing file: ...\sampleTaggedContent.txt (line 4, column 1):
 unexpected end of input
 expecting 
 
 The input was:
[...]
 
 -- Parsers:
 taggedContent = do
optionalWhiteSpace
aTag
many tagOrContent
aTag

many tagOrContent will consume all tags, so that no tag for the
following aTag will be left.

Cheers Christian

eof
return Parse complete.

 tagOrContent = aTag | someContent ? tagOrContent

 aTag = do
tagBegin
xs - many (noneOf [tagEndChar])
tagEnd
optionalWhiteSpace
return ()

 someContent = do
manyTill anyChar tagBegin
return ()

 optionalWhiteSpace = spaces   -- i.e., any of  \v\f\t\r\n
 tagBegin = char tagBeginChar
 tagEnd = char tagEndChar

 -- Etc:
 tagBeginChar = ''
 tagEndChar = ''
 
 
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Parsec question (new user): unexpected end of input

2010-09-29 Thread Christian Maeder
Am 29.09.2010 09:54, schrieb Christian Maeder:
 Am 29.09.2010 05:35, schrieb Peter Schmitz:
 [...]
 Error parsing file: ...\sampleTaggedContent.txt (line 4, column 1):
 unexpected end of input
 expecting 

 The input was:
 [...]

 -- Parsers:
 taggedContent = do
optionalWhiteSpace
aTag
many tagOrContent
aTag
 
 many tagOrContent will consume all tags, so that no tag for the
 following aTag will be left.

if you want to match a final tag, you could try:

  manyTill tagOrContent (try (aTag  eof))

 
 Cheers Christian
 
eof
return Parse complete.

 tagOrContent = aTag | someContent ? tagOrContent

 aTag = do
tagBegin
xs - many (noneOf [tagEndChar])

this also looks like manyTill anyChar tagEnd

C.

tagEnd
optionalWhiteSpace
return ()

 someContent = do
manyTill anyChar tagBegin
return ()

 optionalWhiteSpace = spaces   -- i.e., any of  \v\f\t\r\n
 tagBegin = char tagBeginChar
 tagEnd = char tagEndChar

 -- Etc:
 tagBeginChar = ''
 tagEndChar = ''

 
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Parsec question (new user): unexpected end of input

2010-09-29 Thread Christian Maeder
Am 29.09.2010 11:55, schrieb Christian Maeder:
 Am 29.09.2010 09:54, schrieb Christian Maeder:
 Am 29.09.2010 05:35, schrieb Peter Schmitz:
 [...]
 Error parsing file: ...\sampleTaggedContent.txt (line 4, column 1):
 unexpected end of input
 expecting 

 The input was:
 [...]

 -- Parsers:
 taggedContent = do
optionalWhiteSpace
aTag
many tagOrContent
aTag

 many tagOrContent will consume all tags, so that no tag for the
 following aTag will be left.
 
 if you want to match a final tag, you could try:
 
   manyTill tagOrContent (try (aTag  eof))

better yet, avoiding backtracking, return different things for aTag and
someContents and check if the last entry is a tag.

  tagOrContent = fmap Left aTag | fmap Right someContent

  taggedContent = do
   spaces
   aTag
   l - many tagOrContent
   eof
   case reverse l of
 Left _ : _ - return ()
 _ - fail expected final tag before EOF

C.


 Cheers Christian

eof
return Parse complete.

 tagOrContent = aTag | someContent ? tagOrContent

 aTag = do
tagBegin
xs - many (noneOf [tagEndChar])
 
 this also looks like manyTill anyChar tagEnd
 
 C.
 
tagEnd
optionalWhiteSpace
return ()

 someContent = do
manyTill anyChar tagBegin
return ()

 optionalWhiteSpace = spaces   -- i.e., any of  \v\f\t\r\n
 tagBegin = char tagBeginChar
 tagEnd = char tagEndChar

 -- Etc:
 tagBeginChar = ''
 tagEndChar = ''

 
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] A parsec question

2010-09-29 Thread Ben Franksen
I have a question about Parsec. The following program

 import Control.Applicative ((*),(*))
 import Text.Parsec
 import Text.Parsec.Char
 block p = char '{' * p * char '}'
 parser = block (many digit)
 main = parseTest parser {123a}

gives the output

  parse error at (line 1, column 5):
  unexpected a
  expecting }

Note the last line mentions only '}'. I would rather like to see

  expecting } or digit

since the parser could very well accept another digit here.

(1) What is the reason for this behaviour?
(2) Is there another combinator that behaves as I would like?
(3) Otherwise, how do I write one myself?

BTW, I am using parsec-3.1.0 and ghc-6.12.3.

Cheers
Ben

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Parsec question (new user): unexpected end of input

2010-09-29 Thread Peter Schmitz
Antoine and Christian:
Many thanks for your help on this thread.
(I am still digesting it; much appreciated; will post when I get it working.)
-- Peter
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Parsec question

2009-04-17 Thread Christian Maeder
Michael Mossey wrote:
 Here's what I have so far. It works, but it's a bit weird to consume the
 // as part of the text rather than the keyword. That happens because the
 try( string // ), which is part of the end arg to manyTill, consumes
 the // when it succeeds. But maybe it is the most natural way to express
 the problem.

use lookAhead!

 parseKeyword :: Parser String
 parseKeyword = many1 (alphaNum | char '_')

  parseKeyword = string //  many1 (alphaNum | char '_')

 parseText :: Parser String
 parseText = manyTill anyChar ((try (string //)  return ())
   | eof)

  parseText = manyTill anyChar
$ (lookAhead (try $ string //)  return ())
  | eof

(untested)
C.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Parsec question

2008-12-23 Thread Benedikt Huber

Erik de Castro Lopo schrieb:

Erik de Castro Lopo wrote:


qualifiedIdentifier :: CharParser st [ String ]


Ahh, figured it out myself:

qualifiedIdentifier :: CharParser st [ String ]
qualifiedIdentifier = do
i - identifier
r - dotIdentifier
return (i : r)
where
dotIdentifier = do
char '.'
i - identifier
r - dotIdentifier
return (i  : r)
| return []

Does that look sane to people who know Haskell and Parsec
better than  me?

Hi Erik,
have a look at the module Text.ParserCombinators.Parsec.Combinator.
Those functions should help you to build up parsers from smaller 
building blocks.


Using sepBy1, the above parser can be written as

dot = T.dot lexer
qualifiedIdentifier = sepBy1 identifier dot

benedikt
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Parsec question

2008-12-23 Thread Erik de Castro Lopo
Benedikt Huber wrote:

 have a look at the module Text.ParserCombinators.Parsec.Combinator.
 Those functions should help you to build up parsers from smaller 
 building blocks.
 
 Using sepBy1, the above parser can be written as
 
  dot = T.dot lexer
  qualifiedIdentifier = sepBy1 identifier dot

WOW That is really impressive!

Thanks,
Erik
-- 
-
Erik de Castro Lopo
-
One serious obstacle to the adoption of good programming languages is
the notion that everything has to be sacrificed for speed. In computer
languages as in life, speed kills. -- Mike Vanier
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Parsec question

2008-12-23 Thread Erik de Castro Lopo
Benedikt Huber wrote:

 Using sepBy1, the above parser can be written as
 
  dot = T.dot lexer
  qualifiedIdentifier = sepBy1 identifier dot

My next problem is matching things like:

   identifier  ('.' identifier)*   ('.' '*')?

I've had a look at lookAhead from Text.ParserCombinators.Parsec.Combinator
but I can't get it to work.

Clues?

Erik
-- 
-
Erik de Castro Lopo
-
That being done, all you have to do next is call free() slightly
less often than malloc(). You may want to examine the Solaris
system libraries for a particularly ambitious implementation of
this technique.
-- Eric O'Dell (comp.lang.dylan)
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Parsec question

2008-12-23 Thread Toby Hutton
On Wed, Dec 24, 2008 at 9:22 AM, Erik de Castro Lopo
mle...@mega-nerd.com wrote:

 My next problem is matching things like:

   identifier  ('.' identifier)*   ('.' '*')?

 I've had a look at lookAhead from Text.ParserCombinators.Parsec.Combinator
 but I can't get it to work.

* is analogous to the 'many' combinator, and ? can be implemented with
the 'option' combinator.  Parsec is all about composing bigger parsers
out of smaller ones using combinators like these.

One of the tricks I found early on is to understand where to use 'try'
(since by default input is consumed even if a parser fails) but apart
from that just read Daan's page, even though it's out of date, and
look at how all these cool combinators work.

http://legacy.cs.uu.nl/daan/download/parsec/parsec.html
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Parsec question

2008-12-23 Thread Erik de Castro Lopo
Toby Hutton wrote:

 One of the tricks I found early on is to understand where to use 'try'
 (since by default input is consumed even if a parser fails) but apart
 from that just read Daan's page, even though it's out of date, and
 look at how all these cool combinators work.
 
 http://legacy.cs.uu.nl/daan/download/parsec/parsec.html

Ah yes, reading that document and using 'try' is a good tip. This
is what I cam up with:

qualifiedIdentStar :: CharParser st [ String ]
qualifiedIdentStar = do
try identDotStar
| qualifiedIdentifier
where
identDotStar = do
s - sepEndBy1 identifier dot
char '*'
return (s ++ [ * ])


Cheers,
Erik

-- 
-
Erik de Castro Lopo
-
It has been discovered that C++ provides a remarkable facility
for concealing the trival details of a program -- such as where
its bugs are. -- David Keppel
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Parsec Question

2006-01-09 Thread Christian Maeder

Hi Gerd,

despite SourcePos being abstract, it can be fully manipulated using newPos.

import Text.ParserCombinators.Parsec.Pos

If you can compute the positions from your Tok-stream then you may 
consider using tokenPrim and work with GenParser Tok () a


HTH Christian

Gerd M wrote:
I'm trying to use parsec for parsing a custom input stream. As far as I 
understood the manual correctly I need to define the primitive parser:


type MyParser a   = GenParser (SourcePos,Tok) () a
mytoken :: (Tok - Maybe a) - MyParser a
mytoken test
 = token showToken posToken testToken
 where
   showToken (pos,tok)   = show tok
   posToken  (pos,tok)   = pos
   testToken (pos,tok)   = test tok

The problem is, since SourcePos is an abstract datatype, how can I 
actually run this parser without explicitly using values of type 
SourcePos in the input stream?

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Parsec Question

2006-01-09 Thread Gerd M

despite SourcePos being abstract, it can be fully manipulated using newPos.

Thanks for the tip, I thought it wasn't exported.





Gerd M wrote:
I'm trying to use parsec for parsing a custom input stream. As far as I 
understood the manual correctly I need to define the primitive parser:


type MyParser a   = GenParser (SourcePos,Tok) () a
mytoken :: (Tok - Maybe a) - MyParser a
mytoken test
 = token showToken posToken testToken
 where
   showToken (pos,tok)   = show tok
   posToken  (pos,tok)   = pos
   testToken (pos,tok)   = test tok

The problem is, since SourcePos is an abstract datatype, how can I 
actually run this parser without explicitly using values of type SourcePos 
in the input stream?


_
Express yourself instantly with MSN Messenger! Download today it's FREE! 
http://messenger.msn.click-url.com/go/onm00200471ave/direct/01/


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe