On Wed, Jun 25, 2003 at 01:09:11PM +0100, Orton, Yves wrote:
> Have you considered that it probably makes much more sense to resolve this
> before parsing?
>
> Why not strip these out first?
>
>   s/(?<!\\)#.*$//mg;
>   s/\\\n//g;

I originally did this, and it worked . . . until there was an error in the
config file, at which point the line numbers were wrong, the surrounding
context was wrong, and it was much more difficult to tell where the problem
was. Also I'm stubborn, and having implemented comment skipping and \newline
continuation with lex and yacc I figured I'd be able to do it with P::RD :)

> > I've tried various combinations of backslashes in case it's an
> > interpolation issue, but that hasn't helped. At this stage I'm stuck.
>
> You'll kick yourself. It works just fine with the regex:
>
>   $Parse::RecDescent::skip=qr/(?:\s*#.*\n|\s*\\\s*\n|\s*)*/;

This will match " \\ \n" though, whereas I want it to match only " \\\n".
Inserting the \s* does seem to solve some problem with having that many
backslashes in a row though.

>   $Parse::RecDescent::skip=qr/(?:\s*#.*\n|\s*[\\]\s*\n|\s*)*/;

I tried the character class too, with no luck :(

> The key issue being the (\s*\\\n) on a win32 machine (where I am running) is
> that it won't match, as the text is actually (\s*\\\r\n). By sticking an \s*
> in between the backslash and the \n this is gobbled up and no problem. I
> suspect that if you are on unix that sometimes you are actually encountering
> "\\ \n" or the like.

I'm only running on Unix currently, but may need to process Windows-created
files. I'll work that into the regex.

> Also, is # ONLY used for comments? I would change the regex to be
>
>   $Parse::RecDescent::skip=qr/\s*(?:(?!<[\\\\])#.*\n|[\\\\]\s*\n)*/;

Eh, too many backslashes! Need to study this. Having to use '[\\\\]' seems
wrong though - too many levels of interpolation. Should '(?!<[\\\\])' be
'(?<![\\\\])', the zero-width negative look-behind assertion? Otherwise I
can't see it in perlre. So the regex will match an unescaped # or a literal
\ followed by newline, yeah? Ok, I think I get it.

> It looks like the problem is that the qr() in $PRD::skip gets stringified,
> and then evalled.
> This is a disastrous circumstance as the eval destroys
> all of the benefit of using qr(), and is responsible for (?!<\\) becoming
> (?!<\) which causes all kinds of problems. Even worse is that \\\s* becomes
> \\s* which when evalled becomes \s* which of course will never match "\\ "
> (or shouldn't, I'm not so sure about what Damian is doing here.)

I had problems using comments and whitespace in regexes too, even though
they started with '(?x-ism:'

> This applies whether or not you use qr() or another form of quoting. The
> solution is as I did above, to use [\\\\] instead of \\. This is ok because
> /[\\\\]/ matches the same thing as /\\/ but when it gets double interpolated
> it becomes /[\\]/ which of course is also the same as
> /\\/

Yeah, icky. I guess I didn't quite go far enough with my escaping.

> Why? The grammar says to first compare "yellow" against 'global_lines', and
> to accept that it won't match. It doesn't (as it does not begin with "foo" or
> "bar" or "burger","chips","pizza"), so the 'global_lines' rule is satisfied
> and it goes on to match "yellow" against 'backup'. 'backup' requires that
> the string starts with the literal "backup" which certainly doesn't match,
> so it complains of the fact, quite correctly.

That and the code problem are due to posting after 12 hours in work, without
dinner. Should have tested it first, sorry.

> Which shows that the subrule 'global_lines' was considered to match (0
> times) which indicates that it will then try to match "yellow" against the
> next subrule of 'config'

Yup, that's what I see, and why the error can be ignored. That's why I want a
<reallycommit> that can't be backtracked past.

> Afaict "yellow custard" doesn't ever cause a commit to fire. As such when it
>
>   backup_line: "yellow" <commit> boolean

Shouldn't the parser match "yellow", then <commit>, then fail to match
boolean, then reach the <error?>?
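
For the record, the backslash loss Yves describes can be reproduced without
P::RD at all. This is only a minimal sketch: roundtrip() is a made-up helper
that pastes the stringified qr// into a single-quoted literal and evals it,
which is my assumption about the kind of thing that happens, not the module's
actual code, and it only models one level of re-quoting.

```perl
use strict;
use warnings;

# Hypothetical stand-in for "stringify, then eval". The pattern text is
# pasted into a single-quoted literal, where every \\ collapses to \.
# Parse::RecDescent's real code path may differ.
sub roundtrip {
    my ($re) = @_;
    my $text = "$re";            # a qr// object stringifies to its pattern text
    my $copy = eval "'$text'";   # one extra level of quoting
    die $@ if $@;
    return qr/$copy/;            # recompile whatever survived
}

my $input = "\\\n";              # a backslash followed by a newline

# \\\n survives as \\n, i.e. a backslash followed by the letter "n".
print "plain, direct:     ", ($input =~ qr/\\\n/            ? "match" : "no match"), "\n";
print "plain, roundtrip:  ", ($input =~ roundtrip(qr/\\\n/) ? "match" : "no match"), "\n";

# [\\\\] survives as [\\], which still means one literal backslash.
print "class, roundtrip:  ", ($input =~ roundtrip(qr/[\\\\]\n/) ? "match" : "no match"), "\n";

# (?<!\\) survives as (?<!\), which no longer closes the look-behind.
my $ok = eval { roundtrip(qr/(?<!\\)#/); 1 };
print "look-behind:       ", ($ok ? "compiles" : "fails to compile"), "\n";
```

The plain pattern stops matching after the round trip, the doubled character
class keeps working, and the look-behind no longer compiles, which lines up
with what is described above.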

After spending a few more hours at it, and trying pretty much every possible
combination of <error>, <error?>, <reject>, <commit>, etc., I've given up on
getting error messages. When the parser fails, I can figure out where the
error is like so: split the original text, strip comments, split the
remaining text, calculate the line number, print a warning and the first
line.

Now that I've given up and removed the extra error stuff, I'm getting error
messages:

  ERROR (line -96): Invalid boolean: Was expecting /yes|true|on|1/, or
  /no|false|off|0/

They've still got the negative line numbers though. But now it's not parsing
the backup section properly. Oh wait, it is :)

Any time you use autoactions, don't forget to put {1} after rules which
shouldn't get autoactions, as they have a nasty tendency to fail otherwise.

> Hope the reply wasn't too long for anybody either.

Hell no, it's great to get some help on this :)

> HTH
>
> yves
> ps (If this post doesn't seem as well organized as it could be, it's because
> I changed various bits over time as I explored the issues you have raised.
> So sometimes it may say what I thought originally only to be followed by a
> clarification from further research. Apologies if you find this confusing.)

Thanks for the suggestions - I've tried them with some success :) Time to
give it a rest and get the rest of the script sorted.

-- 
John Tobin
[Parrot] will have reflection, introspection, and Deep Meditative
Capabilities.
    Dan Sugalski, 2002/07/11, [EMAIL PROTECTED]