On Thursday, May 13, 2004 [EMAIL PROTECTED] said:
>
> > 1) you did not read the section on "skip" very carefully.
>
> The "Skipping between terminals" section in the man page doesn't say
> explicitly that <skip: /regex/> is not supported. There is an example
>
>     <skip: qr/[:,]/>
>
> and I had been misled by that.

Um, yeah, well, this is a simple issue of Perl syntax. The '/' in the
above is not a regular expression delimiter but a quote delimiter: the
directive takes a Perl expression, so qr/[:,]/ yields a pattern value,
whereas a bare /regex/ would be a match operation.

>
> > 2) you do not have balanced delimiters in your parse description
>
> Could you please tell a bit more, what do you mean here?
>

Simple. The original supplied script had this:

    push @::value, join ' ', @{$item{'value'};

Notice that this is simply missing a trailing '}' before the ';'.
While this seems "a simple mistake", I have noticed that P::RD does
not tolerate Perl syntax errors "well" and will supply misleading and
possibly erroneous error messages.

>
> > 4) You did not look at the trace results that you printed
> >    out very carefully,
>
> I do look at the trace, but P::RD is a tough module.

That's putting it mildly... The reason for my admonition was simple.
The first thing printed out was this:

    Parse::RecDescent: Treating "mmpfile:" as a rule declaration
    Parse::RecDescent: Treating "chunk(s)" as a one-or-more subrule match
    Parse::RecDescent: Treating "/^\Z/" as a /../ pattern terminal
    Parse::RecDescent: Treating "chunk:" as a rule declaration
    Parse::RecDescent: Treating "comment" as a subrule match
    Parse::RecDescent: Treating "|" as a new production

    Warning: Undefined (sub)rule "comment" used in a production.
    (Hint: Will you be providing this rule later, or did you perhaps
           misspell "comment"? Otherwise it will be treated as an
           immediate <reject>.)

This should have been a Dead Giveaway that something was very wrong.
Not only that, but it points you directly to the first problem (simply
because that particular production with the "skip" was where P::RD
went awry).

> For example
> I'm irritated by its rightmost column. Near the top it shows:
>
> |c_comment |<<Didn't match rule>>                |
> | comment  |<<Didn't match subrule: [c_comment]>>|
> | comment  |Trying production: [cpp_comment]     |
> | comment  |                                     |"\nTARGET
> |          |                                     |CNetB.dll\nTARGETTYPE
> |          |                                     |dll\nUID 0x10000E5E
>
> ....
> |          |                                     |sdpagent.lib\n\n"
> | comment  |Trying subrule: [cpp_comment]        |
> |cpp_commen|Trying rule: [cpp_comment]           |
> |cpp_commen|Trying production: [m{//\s*(.*)}]    |
> |cpp_commen|Trying terminal: [m{//\s*(.*)}]      |
> |cpp_commen|<<Didn't match terminal>>            |
> |cpp_commen|                                     |"TARGET CNetB.dll\nTARGETTYPE
> |          |                                     |dll\nUID 0x10000E5E
>
> .....
> |          |                                     |sdpagent.lib\n\n"
>
> Does the rightmost column hold the content of $text? Why does it
> tell first "Trying production" and shows "\nTARGET", but below
> it tells "Didn't match" and shows "TARGET" without the newline?
>
> I would expect it another way around: before the production
> is tried, the stuff matching $skip is removed isn't it?
> So it should actually show "TARGET", not "\nTARGET" there.

Um, the technical term for this is "hell if I know". If you are
irritated by it now, imagine how irritating it gets when the file you
are parsing is HUGE and you get the whole thing for each and every
attempt... (I modified the P::RD source to truncate the output because
of this.)

It takes some time to learn to read the trace dump, but it is well
worth the effort. You don't have to "like" or even completely
understand the output for it to be useful. Just watch it progress, and
when it seems to be misbehaving, study the resulting productions and
it will usually come to you quickly.

>
> > 7) the path delimiter in windoze (you poor soul...) is '\' not '/'
>
> I'm porting that mess to Unix :-) So I'll leave the / there.

That's great, but your test case had windoze filenames in it...

>
> > assignment: keyword <skip: '[ \t]*'> value(s) <skip: $item[2]> {
> >     push @::keyword, $item{keyword};
> >     push @::value, join ' ', @{$item{'value'}};
> >     1;
> > }
>
> Is restoring the $skip = '\s*' above really needed?

Well, um, actually, no...
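
For what it's worth, here is a minimal sketch of why the restore is not needed, assuming (as your own revised script seems to confirm) that a <skip: ...> change only lasts until the end of the production it appears in. The toy grammar, its rule names (file, line, key, word) and the input string are made up for illustration; they are not taken from your MMP grammar.

```perl
use strict;
use warnings;
use Parse::RecDescent;

# Hypothetical toy grammar, NOT the MMP grammar from this thread.
# "line" narrows the skip pattern to blanks and tabs and never restores it.
my $grammar = q{
    file: line(s) /\z/                  { $return = $item[1] }
    line: key <skip: '[ \t]*'> word(s)  { $return = [ $item[1], @{ $item[3] } ] }
    key:  /[A-Z]+/
    word: /[a-z]+/
};

my $parser = Parse::RecDescent->new($grammar) or die 'Bad grammar';

# Two records separated by a newline; word(s) must stop at the line end,
# and the next "key" must still be able to skip over that newline.
my $result = $parser->file("A x y\nB z\n");
die 'bad text' unless defined $result;

# Print one record per line: the key followed by its words.
print join(' ', @$_), "\n" for @$result;
```

If the narrower skip leaked out of the "line" production, the second "key" could not step over the newline and the whole parse would fail, so getting both records back is the point of the exercise.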

>
> And finally my biggest problem right now - the "keyword value(s)"
> rule is too greedy and consumes the "// added on 01.01.2002" comment,
> as if it were "file"s:
>
>     'LIBRARY' => [
>         'euser.lib',
>         'efsrv.lib',
>         'c32.lib',
>         '//',
>         'added',
>         'on',
>         '01.01.2002',
>         'esock.lib',
>         'bluetooth.lib',
>         'btdevice.lib',
>         'btmanclient.lib',
>         'btextnotifiers.lib',
>         'sdpagent.lib'
>     ],
>
> How could I prevent it?

P.O.C. Use a negative look ahead.

    value: ...!cpp_comment file | type | uid

When I do this I get:

    $hol = {
        'SOURCE' => [
            'CNetB.cpp',
            'CNetBSerialBase.cpp',
            'CNetBBluetoothModule.cpp',
            'CSnakeActiveWrapper.cpp'
        ],
        'SOURCEPATH' => [
            '..\\NetBSrc'
        ],
        'USERINCLUDE' => [
            '..\\NetBInc'
        ],
        'LIBRARY' => [
            'euser.lib',
            'efsrv.lib',
            'c32.lib',
            'esock.lib',
            'bluetooth.lib',
            'btdevice.lib',
            'btmanclient.lib',
            'btextnotifiers.lib',
            'sdpagent.lib'
        ],
        'UID' => [
            '0x10000E5E',
            '1028888'
        ],
        'TARGETTYPE' => [
            'dll'
        ],
        'SYSTEMINCLUDE' => [
            '\\Epoc32\\include',
            '\\Epoc32\\include\\oem'
        ],
        'TARGET' => [
            'CNetB.dll'
        ]
    };
    $c_comment = [
        'START WINS BASEADDRESS 0x46200000 END #if ( (defined ( WINS ) ) || ( defined (WINSCW) ) ) SOURCEPATH ..\\SwImp\\src SOURCE CApiCamSpecsImpSw.cpp #else SOURCEPATH ..\\Mirage1\\src SOURCE CApiCamHandlerImpMirage1.cpp #endif '
    ];
    $cpp_comment = [
        'added on 01.01.2002'
    ];

>
> Regards
> Alex
>
> #!/nokia/apps/tww/@sys/bin/perl -w
>
> use strict;
> use vars qw($parser $text @c_comment @cpp_comment %hol);
> use Data::Dumper;
> use Parse::RecDescent;
> $RD_WARN = 1;
> $RD_HINT = 1;
> $RD_TRACE = 1;
>
> $parser = Parse::RecDescent->new(q(
>
> mmpfile: chunk(s) /^\Z/
> chunk: comment | assignment | <error>
> comment: c_comment | cpp_comment
>
> c_comment: m{/[*]\s*(.*?)[*]/}s {
>     push @::c_comment, $1;
> }
>
> cpp_comment: m{//\s*(.*)} {
>     push @::cpp_comment, $1;
> }
>
> assignment: keyword <skip: '[ \t]*'> value(s) {
>     # add values to the hash of lists, with key = keyword
>     push @{$::hol{uc $item{keyword}}}, @{$item{value}};
> }
>
> value: file | type | uid
> file: m{[\w\\\\/.-]+}
> type: /APP/i | /DLL/i
>
> uid: /0x[0-9A-F]+/i | /\d+/ {
>     # positive hex or decimal number
>     $item[1];
> }
>
> keyword:
>     /^LIBRARY/im |
>     /^SOURCEPATH/im |
>     /^SOURCE/im |
>     /^SYSTEMINCLUDE/im |
>     /^TARGETPATH/im |
>     /^TARGETTYPE/im |
>     /^TARGET/im |
>     /^UID/im |
>     /^USERINCLUDE/im
>
> )) or die 'Bad grammar';
> $text .= $_ while (<DATA>);
> defined $parser->mmpfile($text) or die 'bad text';
> print Data::Dumper->Dump([\%hol,
> [EMAIL PROTECTED],
> [EMAIL PROTECTED],
> [qw(hol c_comment cpp_comment)]);
> __DATA__
>
> TARGET CNetB.dll
> TARGETTYPE dll
> UID 0x10000E5E 1028888
>
> SOURCEPATH ..\NetBSrc
> SOURCE CNetB.cpp CNetBSerialBase.cpp CNetBBluetoothModule.cpp
> SOURCE CSnakeActiveWrapper.cpp
>
> USERINCLUDE ..\NetBInc
> SYSTEMINCLUDE \Epoc32\include \Epoc32\include\oem
>
> LIBRARY euser.lib efsrv.lib c32.lib // added on 01.01.2002
> LIBRARY esock.lib bluetooth.lib btdevice.lib btmanclient.lib
> LIBRARY btextnotifiers.lib sdpagent.lib
>

--
Intel, Corp.
5000 W. Chandler Blvd.
Chandler, AZ 85226