keyword value(s) newline
Hi,

I've been struggling for a few days trying to parse text files which contain C and C++ comments, some #if/#else/#endif statements and, for the most part, assignments that each fit on one line: the keyword at the beginning of the line is followed by space-separated values.

I feel that I've done my homework by reading the FAQ and man P::RD and by searching the archives, but I must admit that I skipped some parts of the documentation because they were too difficult for me to grok.

My script is below. Why doesn't it parse? I tried moving <skip: /[ \t]*/> from the chunk rule to the assignment rule:

assignment: <skip: /[ \t]*/> keyword value(s) /\n/

but it didn't help...

Two additional questions:

Can I use $1, $2 and so on, as captured in the regexes, for example here?

c_comment: m{/[*](.*?)[*]/}s { push @::c_comment, $1; }

Also, is there a way to specify case-insensitive keyword terminals without using regexes like /^SOURCE/im? I also wonder if I should have used /^SOURCE$/im there...

#!/usr/bin/perl -w

use strict;
use vars qw($parser $text @c_comment @cpp_comment @keyword @value);
use Parse::RecDescent;

$RD_WARN=1;
$RD_HINT=1;
$RD_TRACE = 1;

$parser = Parse::RecDescent->new(q(

file: chunk(s) /^\Z/

chunk: comment | <skip: /[ \t]*/> assignment | <error>

comment: c_comment | cpp_comment

cpp_comment: m{//([^\n]*)} {
    push @::cpp_comment, $1;
    1;
}

c_comment: m{/[*](.*?)[*]/}s {
    push @::c_comment, $1;
    1;
}

assignment: keyword value(s) /\n/ {
    push @::keyword, $item{keyword};
    push @::value, join ' ', @{$item{'value(s)'};
    1;
}

value: file | type | uid

file: m{[\w\\/.-]+}

type: /APP/i | /DLL/i

uid: /0x[0-9A-F]+/i

keyword: /^AIF/im | /^DOCUMENT/im | /^LANG/im | /^LIBRARY/im |
    /^RESOURCE/im | /^SOURCE/im | /^SOURCEPATH/im |
    /^SYSTEMINCLUDE/im | /^TARGETPATH/im | /^TARGETTYPE/im |
    /^TARGET/im | /^UID/im | /^USERINCLUDE/im

)) or die 'Bad grammar';

$text .= $_ while (<DATA>);
defined $parser->file($text) or die 'bad text';

__DATA__
TARGET CNetB.dll
TARGETTYPE dll
UID 0x1e5e 0x102F43DB
SOURCEPATH ..\NetBSrc
SOURCE CNetB.cpp CNetBSerialBase.cpp CNetBBluetoothModule.cpp
SOURCE CSnakeActiveWrapper.cpp
USERINCLUDE ..\NetBInc
SYSTEMINCLUDE \Epoc32\include \Epoc32\include\oem
LIBRARY euser.lib efsrv.lib c32.lib
// added on 01.01.2002
LIBRARY esock.lib bluetooth.lib btdevice.lib btmanclient.lib
LIBRARY btextnotifiers.lib sdpagent.lib
/*
START WINS
BASEADDRESS 0x4620
END
#if ( (defined ( WINS ) ) || ( defined (WINSCW) ) )
SOURCEPATH ..\SwImp\src
SOURCE CApiCamSpecsImpSw.cpp
#else
SOURCEPATH ..\Mirage1\src
SOURCE CApiCamHandlerImpMirage1.cpp
#endif
*/
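
The two regex questions in this message can be tried in plain Perl, outside any grammar. The sketch below is only an illustration: the helper names c_comment_bodies and leading_keyword are made up for this note, and it does not show how Parse::RecDescent itself treats $1 inside an action. It shows that $1 holds the comment body after a successful match, and that /i together with \b gives a case-insensitive keyword match that does not stop in the middle of a longer word (which is probably closer to the intent than /^SOURCE$/im, since values follow the keyword on the same line).

```perl
use strict;
use warnings;

# Hypothetical helper: collect the body of every C comment in $text.
# /s lets . cross newlines; .*? stops at the first closing */.
sub c_comment_bodies {
    my ($text) = @_;
    my @bodies;
    while ($text =~ m{/[*](.*?)[*]/}sg) {
        push @bodies, $1;    # $1 is the text between /* and */
    }
    return @bodies;
}

# Hypothetical helper: return the keyword that starts $line, upper-cased,
# or undef if there is none. \b forces the match to end at a word
# boundary, so SOURCE cannot match the first six letters of SOURCEPATH.
sub leading_keyword {
    my ($line) = @_;
    return $line =~ /^(SOURCE|SOURCEPATH|TARGET|TARGETTYPE)\b/i ? uc $1 : undef;
}

print join('|', c_comment_bodies("a /* one */ b /* two\nlines */")), "\n";
print leading_keyword("sourcepath ..\\NetBSrc"), "\n";
```

With \b in place the order of the alternatives no longer matters; without it, the shorter keyword would have to come after the longer one.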
RE: keyword value(s) newline
Sorry, small typo in my mail - the startrule is actually called mmpfile in my script.

So the (non-working) script is:

-----Original Message-----
From: ext [mailto:[EMAIL PROTECTED]

$parser = Parse::RecDescent->new(q(

mmpfile: chunk(s) /^\Z/

chunk: comment | <skip: /[ \t]*/> assignment | <error>

comment: c_comment | cpp_comment

cpp_comment: m{//([^\n]*)} {
    push @::cpp_comment, $1;
    1;
}

c_comment: m{/[*](.*?)[*]/}s {
    push @::c_comment, $1;
    1;
}

assignment: keyword value(s) /\n/ {
    push @::keyword, $item{keyword};
    push @::value, join ' ', @{$item{'value(s)'};
    1;
}

value: file | type | uid

file: m{[\w\\/.-]+}

type: /APP/i | /DLL/i

uid: /0x[0-9A-F]+/i

keyword: /^AIF/im | /^DOCUMENT/im | /^LANG/im | /^LIBRARY/im |
    /^RESOURCE/im | /^SOURCE/im | /^SOURCEPATH/im |
    /^SYSTEMINCLUDE/im | /^TARGETPATH/im | /^TARGETTYPE/im |
    /^TARGET/im | /^UID/im | /^USERINCLUDE/im

)) or die 'Bad grammar';

$text .= $_ while (<DATA>);
defined $parser->mmpfile($text) or die 'bad text';

__DATA__
TARGET CNetB.dll
TARGETTYPE dll
UID 0x1e5e 0x102F43DB
SOURCEPATH ..\NetBSrc
SOURCE CNetB.cpp CNetBSerialBase.cpp CNetBBluetoothModule.cpp
SOURCE CSnakeActiveWrapper.cpp
USERINCLUDE ..\NetBInc
SYSTEMINCLUDE \Epoc32\include \Epoc32\include\oem
LIBRARY euser.lib efsrv.lib c32.lib
// added on 01.01.2002
LIBRARY esock.lib bluetooth.lib btdevice.lib btmanclient.lib
LIBRARY btextnotifiers.lib sdpagent.lib
/*
START WINS
BASEADDRESS 0x4620
END
#if ( (defined ( WINS ) ) || ( defined (WINSCW) ) )
SOURCEPATH ..\SwImp\src
SOURCE CApiCamSpecsImpSw.cpp
#else
SOURCEPATH ..\Mirage1\src
SOURCE CApiCamHandlerImpMirage1.cpp
#endif
*/

And the error message is here (for some reason keyword doesn't match):

|assignment|Trying subrule: [keyword]          |
|assignment|Didn't match subrule: [keyword]    |
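
The trace above only reports that keyword failed, not why. One situation that would produce such a failure is sketched below in plain Perl. It is a deliberately simplified model and an assumption, not Parse::RecDescent's actual code: try_terminal is a made-up helper that first strips whatever the skip pattern allows and then requires the terminal to match at the very start of what remains. If the remaining input is assumed to begin with a newline (a blank line, say), a skip pattern of [ \t]* leaves that newline in place and the keyword cannot match.

```perl
use strict;
use warnings;

# Simplified, hypothetical model of trying a single terminal:
# remove the skippable prefix, then match the terminal at position 0.
sub try_terminal {
    my ($text, $skip, $terminal) = @_;
    $text =~ s/\A(?:$skip)//;
    return $text =~ /\A($terminal)/ ? $1 : undef;
}

# Assumed remaining input: a newline is still in front of the keyword.
my $rest = "\nTARGETTYPE dll\n";

my $narrow = try_terminal($rest, qr/[ \t]*/, qr/^TARGETTYPE/im);
my $wide   = try_terminal($rest, qr/\s*/,    qr/^TARGETTYPE/im);

print 'skip [ \t]*: ', (defined $narrow ? "matched $narrow" : 'no match'), "\n";
print 'skip \s*:    ', (defined $wide   ? "matched $wide"   : 'no match'), "\n";
```

Whether this is what happens in the script above depends on how the skip directive is actually being interpreted, so running with $RD_TRACE and reading the remaining-text column of the trace is still the way to confirm it.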
Re: keyword value(s) newline
Oops. I forgot one important thing that I actually did but left out of my description. For the sake of completeness, my actual working script is at the very bottom.

On Wednesday, May 12, 2004 Ron D. Smith said:
> There are six problems that you have.
>
> 1) You did not read the section on skip very carefully.
> 2) You do not have balanced delimiters in your parse description.
> 3) The second production in the chunk rule does not eliminate leading newlines.
> 4) You did not look very carefully at the trace results that you printed out; if you had, you would have noticed that P::RD was not consuming your *entire* input description.
> 5) The item hash does not include the modifiers in the name space.
> 6) /^SOURCE/ is a subset of /^SOURCEPATH/.
> 7) The path delimiter in windoze (you poor soul...) is '\' not '/'.
>
> OK, so that's seven, but I didn't expect the Spanish Inquisition.
>
> By making these changes I was able to completely parse your test case.
>
> On Wednesday, May 12, 2004 [EMAIL PROTECTED] said:
> > Sorry, small typo in my mail - the startrule is actually called mmpfile in my script.
> > So the (non-working) script is:
> >
> > -----Original Message-----
> > From: ext [mailto:[EMAIL PROTECTED]
> >
> > $parser = Parse::RecDescent->new(q(
> >
> > mmpfile: chunk(s) /^\Z/
> >
> > chunk: comment | <skip: /[ \t]*/> assignment | <error>
> chunk: comment | assignment | <error>
> >
> > comment: c_comment | cpp_comment
> >
> > cpp_comment: m{//([^\n]*)} {
> > push @::cpp_comment, $1;
> > 1; }
> >
> > c_comment: m{/[*](.*?)[*]/}s {
> > push @::c_comment, $1;
> > 1; }
> >
> > assignment: keyword value(s) /\n/ {
> assignment: keyword <skip: '[ \t]*'> value(s) /\n/ {
>                 #---^              ^---
assignment: keyword <skip: '[ \t]*'> value(s) <skip: $item[2]> {
> > push @::keyword, $item{keyword};
> > push @::value, join ' ', @{$item{'value(s)'};
> push @::value, join ' ', @{$item{'value'}};
> > 1; }
> >
> > value: file | type | uid
> >
> > file: m{[\w\\/.-]+}
> file: m{[\w\\.-]+}
> >
> > type: /APP/i | /DLL/i
> >
> > uid: /0x[0-9A-F]+/i
> >
> keyword: /^AIF/im | /^DOCUMENT/im | /^LANG/im | /^LIBRARY/im |
>     /^RESOURCE/im | /^SOURCEPATH/im | /^SOURCE/im |
>     /^SYSTEMINCLUDE/im | /^TARGETPATH/im | /^TARGETTYPE/im |
>     /^TARGET/im | /^UID/im | /^USERINCLUDE/im
> >
> > )) or die 'Bad grammar';
> >
> > $text .= $_ while (<DATA>);
> > defined $parser->mmpfile($text) or die 'bad text';
> >
> > __DATA__
> > TARGET CNetB.dll
> > TARGETTYPE dll
> > UID 0x1e5e 0x102F43DB
> > SOURCEPATH ..\NetBSrc
> > SOURCE CNetB.cpp CNetBSerialBase.cpp CNetBBluetoothModule.cpp
> > SOURCE CSnakeActiveWrapper.cpp
> > USERINCLUDE ..\NetBInc
> > SYSTEMINCLUDE \Epoc32\include \Epoc32\include\oem
> > LIBRARY euser.lib efsrv.lib c32.lib
> > // added on 01.01.2002
> > LIBRARY esock.lib bluetooth.lib btdevice.lib btmanclient.lib
> > LIBRARY btextnotifiers.lib sdpagent.lib
> > /*
> > START WINS
> > BASEADDRESS 0x4620
> > END
> > #if ( (defined ( WINS ) ) || ( defined (WINSCW) ) )
> > SOURCEPATH ..\SwImp\src
> > SOURCE CApiCamSpecsImpSw.cpp
> > #else
> > SOURCEPATH ..\Mirage1\src
> > SOURCE CApiCamHandlerImpMirage1.cpp
> > #endif
> > */
> >
> > And the error message is here (for some reason keyword doesn't match):
> >
> > |assignment|Trying subrule: [keyword]          |
> > |assignment|Didn't match subrule: [keyword]    |

--
Intel, Corp.
5000 W. Chandler Blvd.
Chandler, AZ 85226

use strict;
use vars qw($parser $text @c_comment @cpp_comment @keyword @value);
use Parse::RecDescent;

$RD_WARN=1;
$RD_HINT=1;
$RD_TRACE = 1;

$parser = Parse::RecDescent->new(q(

file: chunk(s) /^\Z/

chunk: comment | assignment | <error>

comment: c_comment | cpp_comment

cpp_comment: m{//([^\n]*)} {
    push @::cpp_comment, $1;
    1;
}

c_comment: m{/[*](.*?)[*]/}s {
    push @::c_comment, $1;
    1;
}

assignment: keyword <skip: '[ \t]*'> value(s) <skip: $item[2]> {
    push @::keyword, $item{keyword};
    push @::value, join ' ', @{$item{'value'}};
    1;
}

value: file | type | uid

file: m{[\w.-]+}

type: /APP/i | /DLL/i

uid: /0x[0-9A-F]+/i

keyword: /^AIF/im | /^DOCUMENT/im | /^LANG/im | /^LIBRARY/im |
    /^RESOURCE/im | /^SOURCEPATH/im | /^SOURCE/im |
    /^SYSTEMINCLUDE/im | /^TARGETPATH/im | /^TARGETTYPE/im |
    /^TARGET/im | /^UID/im | /^USERINCLUDE/im

)) or die 'Bad grammar';

$text .= $_ while (<DATA>);
defined $parser->file($text) or die 'bad text';

__DATA__
TARGET CNetB.dll
TARGETTYPE dll
UID 0x1e5e 0x102F43DB
SOURCEPATH ..\NetBSrc
SOURCE