Re: keyword value(s) newline
Ron D. Smith wrote:
> ...
> Um the technical term for this is hell if I know. If you are irritated
> by it now, imagine how irritating it gets when the file you are parsing
> is HUGE and you get the whole thing for each and every attempt... (I
> modified the P::RD source to truncate the output because of this.) It
> takes some time to

Not necessary, just use

    $::RD_TRACE = 120;   # since P::RD version 1.20

Defining $::RD_TRACE causes the parser generator and the parser to
report their progress to STDERR in excruciating detail (although,
without hints unless $::RD_HINT is separately defined). This detail can
be moderated in only one respect: if $::RD_TRACE has an integer value
(N) greater than 1, only the N characters of the current parsing
context (that is, where in the input string we are at any point in the
parse) is reported at any time.

Best Regards
Charly

-- 
Karl Gaissmaier KIZ/Infrastructure, University of Ulm, Germany
Email:[EMAIL PROTECTED] Service Group Network
Re: keyword value(s) newline
On Thursday, May 13, 2004 [EMAIL PROTECTED] said:
> > 1) you did not read the section on skip very carefully.
>
> The "Skipping between terminals" section in the man page doesn't say
> explicitly that <skip: /regex/> is not supported. There is an example
> <skip: qr/[:,]/> and I was misled by that.

Um, yeah, well, this is a simple issue of perl syntax. The '/' in the
above is not a regular expression delimiter but a quote delimiter.

> > 2) you do not have balanced delimiters in your parse description
>
> Could you please explain a bit more what you mean here?

Simple. The original supplied script had this:

    push @::value, join ' ', @{$item{'value'};

Notice that this is simply missing a trailing '}' before the ';'. While
this seems a simple mistake, I have noticed that P::RD does not tolerate
perl syntax errors well and will supply misleading and possibly
erroneous error messages.

> > 4) You did not look at the trace results that you printed out very
> > carefully,
>
> I do look at the trace, but P::RD is a tough module.

That's putting it mildly... The reason for my admonition was simple.
The first thing printed out was this:

    Parse::RecDescent: Treating mmpfile: as a rule declaration
    Parse::RecDescent: Treating chunk(s) as a one-or-more subrule match
    Parse::RecDescent: Treating /^\Z/ as a /../ pattern terminal
    Parse::RecDescent: Treating chunk: as a rule declaration
    Parse::RecDescent: Treating comment as a subrule match
    Parse::RecDescent: Treating | as a new production
    Warning: Undefined (sub)rule comment used in a production.
    (Hint: Will you be providing this rule later, or did you perhaps
    misspell comment? Otherwise it will be treated as an immediate
    reject.)

This should have been a Dead Giveaway that something was very wrong.
Not only that, but it points you directly to the first problem (simply
because that particular production with the skip was where P::RD went
awry).

> For example I'm irritated by its rightmost column.
> Near the top it shows:
>
> |c_comment |Didn't match rule                 |
> | comment  |Didn't match subrule: [c_comment] |
> | comment  |Trying production: [cpp_comment]  |
> | comment  |                                  |\nTARGET
> |          |                                  |CNetB.dll\nTARGETTYPE
> |          |                                  |dll\nUID 0x1E5E
> |          |                                  |sdpagent.lib\n\n
> | comment  |Trying subrule: [cpp_comment]     |
> |cpp_commen|Trying rule: [cpp_comment]        |
> |cpp_commen|Trying production: [m{//\s*(.*)}] |
> |cpp_commen|Trying terminal: [m{//\s*(.*)}]   |
> |cpp_commen|Didn't match terminal             |
> |cpp_commen|                                  |TARGET CNetB.dll\nTARGETTYPE
> |          |                                  |dll\nUID 0x1E5E .
> |          |                                  |sdpagent.lib\n\n
>
> Does the rightmost column hold the content of $text? Why does it first
> say "Trying production" and show \nTARGET, but below say "Didn't
> match" and show TARGET without the newline? I would expect it the
> other way around: before the production is tried, the stuff matching
> $skip is removed, isn't it? So it should actually show TARGET, not
> \nTARGET there.

Um the technical term for this is hell if I know. If you are irritated
by it now, imagine how irritating it gets when the file you are parsing
is HUGE and you get the whole thing for each and every attempt... (I
modified the P::RD source to truncate the output because of this.)

It takes some time to learn to read the trace dump, but it is well
worth the effort. You don't have to like or even completely understand
the output for it to be useful. Just watch it progress and when it
seems to be misbehaving, study the resulting productions and it will
usually come to you quickly.

> > 7) the path delimiter in windoze (you poor soul...) is '\' not '/'
>
> I'm porting that mess to Unix :-) So I'll leave the / there.

That's great, but your test case had windoze filenames in it...

> > assignment: keyword <skip: '[ \t]*'> value(s) <skip: $item[2]> {
> >     push @::keyword, $item{keyword};
> >     push @::value, join ' ', @{$item{'value'}};
> >     1;
> > }
>
> Is restoring the $skip = '\s*' above really needed?

Well, um, actually, no...

> And finally my biggest problem right now - the keyword value(s) rule
> is too greedy and consumes the "// added on 01.01.2002" comment, as if
> it were files:
>
>     'LIBRARY' => [
>         'euser.lib',
>         'efsrv.lib',
>         'c32.lib',
>         '//',
>         'added',
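
The greedy match described above can be reproduced with plain Perl regexes, without Parse::RecDescent. The sketch below is only a simplified model: it assumes value(s) behaves like "strip blanks and tabs, then match the file pattern at the current position, and repeat". The helper name values_on_line and the negative lookahead (?!//) are illustrative assumptions, not something proposed in the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: repeatedly skip blanks/tabs, then match $value_re
# anchored at the current position (a simplified model of value(s)).
sub values_on_line {
    my ($text, $value_re) = @_;
    my @values;
    while ($text =~ /\G[ \t]*($value_re)/gc) {
        push @values, $1;
    }
    return @values;
}

my $line = "euser.lib efsrv.lib c32.lib // added on 01.01.2002\n";

# The file pattern from the original script: '/' is in the character
# class, so '//' and every word of the comment qualify as "files".
my $greedy = qr{[\w\\/.-]+};
my @greedy = values_on_line($line, $greedy);

# One possible guard (an assumption): refuse to start a value at '//'.
my $guarded = qr{(?!//)[\w\\/.-]+};
my @guarded = values_on_line($line, $guarded);

print "greedy:  @greedy\n";
print "guarded: @guarded\n";
```

With the original character class the comment words end up in the value list, as in the dump quoted above; with the lookahead the list stops at c32.lib and the comment is left in the input for a comment rule to pick up.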
RE: keyword value(s) newline
Sorry, small typo in my mail - the startrule is actually called mmpfile
in my script. So the (non-working) script is:

-----Original Message-----
From: ext [mailto:[EMAIL PROTECTED]

$parser = Parse::RecDescent->new(q(

mmpfile: chunk(s) /^\Z/
chunk: comment | <skip: /[ \t]*/> assignment | <error>
comment: c_comment | cpp_comment
cpp_comment: m{//([^\n]*)} { push @::cpp_comment, $1; 1; }
c_comment: m{/[*](.*?)[*]/}s { push @::c_comment, $1; 1; }
assignment: keyword value(s) /\n/ {
    push @::keyword, $item{keyword};
    push @::value, join ' ', @{$item{'value(s)'};
    1;
}
value: file | type | uid
file: m{[\w\\/.-]+}
type: /APP/i | /DLL/i
uid: /0x[0-9A-F]+/i
keyword: /^AIF/im | /^DOCUMENT/im | /^LANG/im | /^LIBRARY/im |
    /^RESOURCE/im | /^SOURCE/im | /^SOURCEPATH/im |
    /^SYSTEMINCLUDE/im | /^TARGETPATH/im | /^TARGETTYPE/im |
    /^TARGET/im | /^UID/im | /^USERINCLUDE/im

)) or die 'Bad grammar';

$text .= $_ while (<DATA>);
defined $parser->mmpfile($text) or die 'bad text';

__DATA__

TARGET CNetB.dll
TARGETTYPE dll
UID 0x1e5e 0x102F43DB
SOURCEPATH ..\NetBSrc
SOURCE CNetB.cpp CNetBSerialBase.cpp CNetBBluetoothModule.cpp
SOURCE CSnakeActiveWrapper.cpp
USERINCLUDE ..\NetBInc
SYSTEMINCLUDE \Epoc32\include \Epoc32\include\oem
LIBRARY euser.lib efsrv.lib c32.lib // added on 01.01.2002
LIBRARY esock.lib bluetooth.lib btdevice.lib btmanclient.lib
LIBRARY btextnotifiers.lib sdpagent.lib

/*
START WINS
BASEADDRESS 0x4620
END

#if ( (defined ( WINS ) ) || ( defined (WINSCW) ) )
SOURCEPATH ..\SwImp\src
SOURCE CApiCamSpecsImpSw.cpp
#else
SOURCEPATH ..\Mirage1\src
SOURCE CApiCamHandlerImpMirage1.cpp
#endif
*/

And the error message is here (for some reason keyword doesn't match):

|assignment|Trying subrule: [keyword]       |
|assignment|Didn't match subrule: [keyword] |
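
One factor that may contribute to the keyword failure shown above can be modelled in plain Perl. The sketch assumes a terminal is matched by first removing whatever the skip pattern allows and then requiring the terminal at the very start of the remaining text; that is a simplification, not Parse::RecDescent's actual code, and try_terminal is a hypothetical helper. If the skip pattern is narrowed to blanks and tabs, a newline left in front of a keyword is no longer skipped and the anchored match fails.

```perl
use strict;
use warnings;

# Hypothetical helper modelling one terminal match: remove whatever the
# skip pattern allows, then require the terminal at the very start of
# what is left. Returns 1 on success, 0 on failure.
sub try_terminal {
    my ($text, $skip, $terminal) = @_;
    $text =~ s/\A(?:$skip)//;
    return $text =~ /\A(?:$terminal)/ ? 1 : 0;
}

# Input as it might look when a preceding newline has not been consumed.
my $input   = "\nTARGET CNetB.dll\n";
my $keyword = qr/^TARGET/im;

# Skip any whitespace, including newlines.
my $with_default_skip = try_terminal($input, qr/\s*/, $keyword);

# Skip blanks and tabs only, so the newline stays in front of TARGET.
my $with_narrow_skip = try_terminal($input, qr/[ \t]*/, $keyword);

print "skip \\s*    : ", ($with_default_skip ? "match" : "no match"), "\n";
print "skip [ \\t]* : ", ($with_narrow_skip  ? "match" : "no match"), "\n";
```

In this model the keyword matches under the whitespace skip and fails under the narrowed one, because the /m flag cannot help once the match is anchored at the start of the remaining text.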
Re: keyword value(s) newline
Oops. I forgot one important thing that I actually did but left out of
my description. For the sake of completeness, my actual working script
is at the very bottom.

On Wednesday, May 12, 2004 Ron D. Smith said:
> There are six problems that you have.
>
> 1) you did not read the section on skip very carefully.
> 2) you do not have balanced delimiters in your parse description
> 3) the second production in the chunk rule does not eliminate leading
>    newlines
> 4) You did not look at the trace results that you printed out very
>    carefully; if you had, you would have noticed that P::RD was not
>    consuming your *entire* input description.
> 5) the item hash does not include the modifiers in the name space.
> 6) /^SOURCE/ also matches the start of SOURCEPATH, so /^SOURCEPATH/
>    has to be tried first
> 7) the path delimiter in windoze (you poor soul...) is '\' not '/'
>
> OK, so that's seven, but I didn't expect the Spanish Inquisition.
>
> By making these changes I was able to completely parse your test case.
>
> On Wednesday, May 12, 2004 [EMAIL PROTECTED] said:
> > Sorry, small typo in my mail - the startrule is actually called
> > mmpfile in my script.
> > So the (non-working) script is:
> >
> > -----Original Message-----
> > From: ext [mailto:[EMAIL PROTECTED]
> >
> > $parser = Parse::RecDescent->new(q(
> >
> > mmpfile: chunk(s) /^\Z/
> > chunk: comment | <skip: /[ \t]*/> assignment | <error>
>
> chunk: comment | assignment | <error>
>
> > comment: c_comment | cpp_comment
> > cpp_comment: m{//([^\n]*)} { push @::cpp_comment, $1; 1; }
> > c_comment: m{/[*](.*?)[*]/}s { push @::c_comment, $1; 1; }
> > assignment: keyword value(s) /\n/ {
>
> assignment: keyword <skip: '[ \t]*'> value(s) /\n/ {
> #---^ ^---

assignment: keyword <skip: '[ \t]*'> value(s) <skip: $item[2]> {

> > push @::keyword, $item{keyword};
> > push @::value, join ' ', @{$item{'value(s)'};
>
> push @::value, join ' ', @{$item{'value'}};
>
> > 1;
> > }
> > value: file | type | uid
> > file: m{[\w\\/.-]+}
>
> file: m{[\w\\.-]+}
>
> > type: /APP/i | /DLL/i
> > uid: /0x[0-9A-F]+/i
>
> keyword: /^AIF/im | /^DOCUMENT/im | /^LANG/im | /^LIBRARY/im |
>     /^RESOURCE/im | /^SOURCEPATH/im | /^SOURCE/im |
>     /^SYSTEMINCLUDE/im | /^TARGETPATH/im | /^TARGETTYPE/im |
>     /^TARGET/im | /^UID/im | /^USERINCLUDE/im
>
> > )) or die 'Bad grammar';
> >
> > $text .= $_ while (<DATA>);
> > defined $parser->mmpfile($text) or die 'bad text';
> >
> > __DATA__
> >
> > TARGET CNetB.dll
> > TARGETTYPE dll
> > UID 0x1e5e 0x102F43DB
> > SOURCEPATH ..\NetBSrc
> > SOURCE CNetB.cpp CNetBSerialBase.cpp CNetBBluetoothModule.cpp
> > SOURCE CSnakeActiveWrapper.cpp
> > USERINCLUDE ..\NetBInc
> > SYSTEMINCLUDE \Epoc32\include \Epoc32\include\oem
> > LIBRARY euser.lib efsrv.lib c32.lib // added on 01.01.2002
> > LIBRARY esock.lib bluetooth.lib btdevice.lib btmanclient.lib
> > LIBRARY btextnotifiers.lib sdpagent.lib
> >
> > /*
> > START WINS
> > BASEADDRESS 0x4620
> > END
> >
> > #if ( (defined ( WINS ) ) || ( defined (WINSCW) ) )
> > SOURCEPATH ..\SwImp\src
> > SOURCE CApiCamSpecsImpSw.cpp
> > #else
> > SOURCEPATH ..\Mirage1\src
> > SOURCE CApiCamHandlerImpMirage1.cpp
> > #endif
> > */
> >
> > And the error message is here (for some reason keyword doesn't
> > match):
> >
> > |assignment|Trying subrule: [keyword]       |
> > |assignment|Didn't match subrule: [keyword] |
>
> -- 
> Intel, Corp.
> 5000 W. Chandler Blvd.
> Chandler, AZ 85226

-- 
Intel, Corp.
5000 W. Chandler Blvd.
Chandler, AZ 85226

use strict;
use vars qw($parser $text @c_comment @cpp_comment @keyword @value);
use Parse::RecDescent;

# The P::RD debugging variables live in package main; under "use strict"
# they need the explicit $:: prefix.
$::RD_WARN  = 1;
$::RD_HINT  = 1;
$::RD_TRACE = 1;

$parser = Parse::RecDescent->new(q(

file: chunk(s) /^\Z/
chunk: comment | assignment | <error>
comment: c_comment | cpp_comment
cpp_comment: m{//([^\n]*)} { push @::cpp_comment, $1; 1; }
c_comment: m{/[*](.*?)[*]/}s { push @::c_comment, $1; 1; }
assignment: keyword <skip: '[ \t]*'> value(s) <skip: $item[2]> {
    push @::keyword, $item{keyword};
    push @::value, join ' ', @{$item{'value'}};
    1;
}
value: file | type | uid
file: m{[\w.-]+}
type: /APP/i | /DLL/i
uid: /0x[0-9A-F]+/i
keyword: /^AIF/im | /^DOCUMENT/im | /^LANG/im | /^LIBRARY/im |
    /^RESOURCE/im | /^SOURCEPATH/im | /^SOURCE/im |
    /^SYSTEMINCLUDE/im | /^TARGETPATH/im | /^TARGETTYPE/im |
    /^TARGET/im | /^UID/im | /^USERINCLUDE/im

)) or die 'Bad grammar';

$text .= $_ while (<DATA>);
defined $parser->file($text) or die 'bad text';

__DATA__

TARGET CNetB.dll
TARGETTYPE dll
UID 0x1e5e 0x102F43DB
SOURCEPATH ..\NetBSrc
SOURCE