On Mon, Feb 18, 2013 at 10:01:29AM +0100, Michele Tartara wrote:
> On Mon, Feb 18, 2013 at 9:54 AM, Iustin Pop <[email protected]> wrote:
> 
> > On Fri, Feb 15, 2013 at 09:24:40PM +0100, Michele Tartara wrote:
> > > On Fri, Feb 15, 2013 at 10:33 AM, Iustin Pop <[email protected]> wrote:
> > >
> > > > On Fri, Feb 15, 2013 at 10:21:46AM +0100, Michele Tartara wrote:
> > > > > On Thu, Feb 14, 2013 at 2:53 PM, Iustin Pop <[email protected]>
> > wrote:
> > > > >
> > > > > > On Wed, Feb 06, 2013 at 01:09:51PM +0100, Michele Tartara wrote:
> > > > > > > In order to fetch precise information about the status of the VMs
> > > > > > running in
> > > > > > > Xen, we need to analyze the output of the "xm list --long"
> > command.
> > > > > > >
> > > > > > > This commit adds the parser to do that, and its tests.
> > > >
> > > > OK, thanks for all the acks, LGTM on submitting it with them.
> > > >
> > > > thanks,
> > > > iustin
> > > >
> > >
> > >
> > > While switching from my genNonEmptyString to genName, I found out a bug
> > in
> > > the parser.
> > > If a string like "9a" was received, it would have been parsed as LCDouble
> > > 9, instead of LCString 9a.
> > >
> > > In order to fix the bug, I needed to use a lookahead of 1 character to
> > > determine if after the number there was an alphabetic caracter or the
> > > actual end of the number.
> > >
> > > This required some non-trivial modifications (especially because
> > > Data.Attoparsec.Text does not support lookahead, whereas
> > > Data.Attoparsec.ByteString.Char8 does).
> >
> > Hmm. Can't you fix this without lookahead, by requiring the "leftover"
> > string to be empty? IIRC you use that model already in the DRBD parser…
> >
> >
> Yes, but that is just the most simple case. The exact same problem happens
> in a configuration like this:
> (name 9a (100 foo))
> or in a string like this:
> (field 40rg)
> 
> Not just in the trivial string-only case. A number is actually a number
> only if it is followed by a whitespace or a ')'.
> The end of input is just a very specific case.

Of course, if you mean end of entire string. I was referring only to
things like (field 40org), where you have a 'virtual' end-of-string
after 40org. But if you have to support (name 9a (…)), then it won't
work, indeed.

> There is probably a way to check this without lookahead, but I haven't
> found it yet (and anyway I expect it likely to make the parser quite
> unreadable).

So something has to do the lookahead in any case, due to the input
format. I was just thinking if there is a way to push the actual
lookahead to Attoparsec, instead of you doing it manually. Something
like:

  manyTill digit (try (eitherP (string ")") (string " "))))

Unfortunately this will eat the end of the string, which maybe is not
what you want.

But after our offline conversations, I realised something else: the
problem is, I think, your manyTill, which allows multiple
listConfigParsers near each other. I think the proper way to fix this
(without explicit lookahead) would be to require them to be separated by
whitespace, e.g. something along the lines:

  where listConfigP =
    (A.char '(' *> AC.sepBy1 lispConfigParser whiteSpaceP <* A.char ')')
    >>= pure . LCList

However, this doesn't work, exactly because of the '123a' being
misparsed. Grrr :)

thanks,
iustin

Reply via email to