Jayce^ wrote:
Actually though, for anyone who doesn't know:
In Perl regexes, there are flags at the end (to be fixed, Perl 6 rightly
puts them at the front) that modify how the engine treats the pattern.
One of these is the underutilized x modifier. When added, it makes
whitespace in your regex insignificant (unless it is escaped or inside a
character class), so you can tab/space/newline the pattern into
something that is actually readable... Oh, and you can/should add
comments too. So:
s/(foo|bar)//;
can be
s/
    (       # either
      foo
      |     # or...
      bar
    )
//x;
Very simplified, but the basic point is there: make things readable,
comment, etc.
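For anyone who wants to try it, here is a minimal sketch of the same
substitution written both ways, plus the one thing that tends to surprise
people: under x a literal space has to be escaped (or written as \s).
The sub names strip_compact, strip_extended and has_foo_bar are made up
for this example.

```perl
use strict;
use warnings;

# Compact form from the post.
sub strip_compact {
    my ($text) = @_;
    $text =~ s/(foo|bar)//;
    return $text;
}

# Same substitution spread out with /x; whitespace and comments are ignored.
sub strip_extended {
    my ($text) = @_;
    $text =~ s/
        (        # either
          foo
          |      # or...
          bar
        )
    //x;
    return $text;
}

# Under /x an unescaped space is ignored, so a literal space is escaped.
sub has_foo_bar {
    my ($text) = @_;
    return $text =~ / foo \  bar /x ? 1 : 0;
}

print strip_compact("a foo b"), "\n";     # both subs remove the first foo or bar
print strip_extended("a bar b"), "\n";
print has_foo_bar("foo bar"), " ", has_foo_bar("foobar"), "\n";   # prints "1 0"
```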
--
Jayce^
I strongly recommend using this feature to make your REGEXes easier to
decipher when you get back to them a few months later. I also advise
that you get the REGEX working first, and then add the comments. I
wrote a fairly lengthy REGEX to parse a Synopsys report and found that
some of the /special/ characters in my comments were still affecting the
interpreter. Wish I could remember which chars gave trouble, but be
aware that NOT everything after a # is treated like a comment. Still a
great feature; just add the comments incrementally to make sure it still
works for you. Code from the script looks like:
[report to parse was read into $content var]
while ($content =~ m/                      # use regex match as key for while loop
    Startpoint:\ (\S+)                     # $1 will be startpoint
    \s+\((.+)\)\n                          # $2 will be desc of startpoint
    \s\sEndpoint:\ (\S+)                   # $3 will be endpoint
    \s+\((.+)\)\n                          # $4 will be desc of endpoint
    \s\sPath\ Group:\ (\S+)\n              # $5 will be pathgroup
    \s\sPath\ Type:\ (\S+)\n               # $6 will be pathtype
    [\s\S]+?                               # skip ahead (non-greedy) to the slack line
    slack\ \(VIOLATED\)\s+(-\d+\.\d\d)\n   # $7 will be slack
    /gx) {
    $startpoint = $1;
    $start_desc = $2;
    $endpoint   = $3;
    $end_desc   = $4;
    $pathgroup  = $5;
    $pathtype   = $6;
    $slack      = $7;
    # ...process data etc...
}
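To see the loop run end to end, here is a self-contained sketch. The
report text is invented purely to fit the pattern (it is not real
Synopsys output), and collecting the captures into a list of hashes is
my own choice for the demo, not what the original script did.

```perl
use strict;
use warnings;

# Hypothetical sample text, shaped only to fit the pattern below.
# It is not real Synopsys output.
my $content = join "\n",
    '  Startpoint: regA (rising edge-triggered flip-flop)',
    '  Endpoint: regB (rising edge-triggered flip-flop)',
    '  Path Group: clk',
    '  Path Type: max',
    '  data arrival time 1.50',
    '  slack (VIOLATED) -0.25',
    '  Startpoint: regC (input port)',
    '  Endpoint: regD (output port)',
    '  Path Group: io',
    '  Path Type: min',
    '  data arrival time 0.40',
    '  slack (VIOLATED) -1.30',
    '';

my @paths;
# In scalar context, each pass of a g-flag match resumes where the last one ended.
while ($content =~ m/
        Startpoint:\ (\S+)                    # capture 1: startpoint
        \s+\((.+)\)\n                         # capture 2: startpoint description
        \s\sEndpoint:\ (\S+)                  # capture 3: endpoint
        \s+\((.+)\)\n                         # capture 4: endpoint description
        \s\sPath\ Group:\ (\S+)\n             # capture 5: path group
        \s\sPath\ Type:\ (\S+)\n              # capture 6: path type
        [\s\S]+?                              # lazily skip to the slack line
        slack\ \(VIOLATED\)\s+(-\d+\.\d\d)\n  # capture 7: negative slack
    /gx) {
    push @paths, {
        startpoint => $1, start_desc => $2,
        endpoint   => $3, end_desc   => $4,
        pathgroup  => $5, pathtype   => $6,
        slack      => $7,
    };
}

for my $p (@paths) {
    printf "%s -> %s [%s, %s] slack %s\n",
        @{$p}{qw(startpoint endpoint pathgroup pathtype slack)};
}
# Prints:
#   regA -> regB [clk, max] slack -0.25
#   regC -> regD [io, min] slack -1.30
```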
Probably the /most/ painful REGEX I've written yet, but it was part of a
script that allowed me to parse through 100+ reports and condense the
info down to something readable.
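On the warning about special characters in comments: two cases that are
known to bite under x are the pattern delimiter showing up inside a
comment (Perl finds the end of the pattern before it ever looks at
comments) and a literal # that was meant to be matched. I can't say
these are the characters that caused the trouble in the Synopsys script;
this is only a sketch of the general pitfall and one way around it. The
sub names parse_ratio and parse_issue are made up for the example.

```perl
use strict;
use warnings;

# A "/" inside a comment would end an m/.../x pattern early, so this
# sketch uses brace delimiters, which leave "/" free for comments.
sub parse_ratio {
    my ($text) = @_;
    if ($text =~ m{
            (\d+)      # numerator, e.g. the 3 in 3/4
            /          # a literal slash is safe with brace delimiters
            (\d+)      # denominator
        }x) {
        return ($1, $2);
    }
    return;
}

# Under /x an unescaped "#" starts a comment, so a literal one is escaped.
sub parse_issue {
    my ($text) = @_;
    if ($text =~ m{
            issue \s*
            \#         # escaped, so it matches a literal hash sign
            (\d+)      # the issue number
        }x) {
        return $1;
    }
    return;
}

my ($num, $den) = parse_ratio("ratio 3/4");
print "$num over $den\n";                    # prints "3 over 4"
print parse_issue("see issue #42"), "\n";    # prints "42"
```

With brace delimiters, any braces inside the pattern or its comments
still have to balance, so the same habit applies: add comments a few at
a time and re-run.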
Justin Gedge