Jayce^ wrote:

Actually though, for any who don't know:
In Perl regexes, flags at the end of the pattern (Perl 6 fixes this and rightly puts them at the front) modify how the engine treats the pattern. One of these is the underused x modifier. With it, most whitespace in the pattern becomes insignificant (escaped whitespace and whitespace inside a character class still count), so you can use tabs, spaces, and newlines to lay the regex out readably. You can, and should, add comments too. So:

s/(foo|bar)//;

can be
s/
    (        # either
        foo
      |      # or...
        bar
    )
 //x;


Very simplified, but the basic point is there: make things readable, add comments, and so on.
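
A minimal sketch of the point above, runnable as-is. The sample string and the variable names are made up for illustration. It checks that the compact and the /x forms do the same thing, and shows the one thing to watch for: under /x a literal space in the pattern has to be escaped.

```perl
use strict;
use warnings;

# Hypothetical sample text, just for the demonstration.
my $text = 'a foo, a bar, a baz';

# Compact form.
(my $compact = $text) =~ s/(foo|bar)//g;

# Same substitution with /x: layout whitespace and comments are ignored.
(my $verbose = $text) =~ s/
    (        # either
        foo
      |      # or...
        bar
    )
//gx;

# Under /x an unescaped space is ignored, so this really looks for "abaz".
my $unescaped = ($text =~ /a baz/x)  ? 1 : 0;

# Escaping the space makes it significant again.
my $escaped   = ($text =~ /a\ baz/x) ? 1 : 0;

print "compact and verbose forms agree\n" if $compact eq $verbose;
print "unescaped space matched: $unescaped, escaped space matched: $escaped\n";
```

Putting the space in a character class, as in `[ ]`, is another common way to keep it significant under /x.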

--
Jayce^

I strongly recommend using this feature to make your REGEXes easier to decipher when you get back to them a few months later. I also advise that you get the REGEX to work first, and then add the comments. I wrote a fairly lengthy REGEX to parse a Synopsys report and found that some of the /special/ characters in my comments were still affecting the interpreter. I wish I could remember which characters gave trouble, but be aware that NOT everything after a # is treated as a comment. It's still a great feature; just add the comments incrementally so you can check that the REGEX still works. The code looks like:

[report to parse was read into $content var]

while($content =~ m/ # use regex match as key for while loop
     Startpoint:\ (\S+)                     # $1 will be startpoint
       \s+\((.+)\)\n                        # $2 will be desc of startpoint
       \s\sEndpoint:\ (\S+)                 # $3 will be endpoint
       \s+\((.+)\)\n                        # $4 will be desc of endpoint
       \s\sPath\ Group:\ (\S+)\n            # $5 will be pathgroup
       \s\sPath\ Type:\ (\S+)\n             # $6 will be pathtype
       [\s\S]+?                             # skip ahead (non-greedy) to the slack line
       slack\ \(VIOLATED\)\s+(-\d+\.\d\d)\n # $7 will be slack
       /gx ) {                              #

   my $startpoint = $1;
   my $start_desc = $2;
   my $endpoint   = $3;
   my $end_desc   = $4;
   my $pathgroup  = $5;
   my $pathtype   = $6;
   my $slack      = $7;

   # ...process data etc...
}

Probably the /most/ painful REGEX I've written yet, but it was part of a script that let me parse 100+ reports and condense the info down to something readable.
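
One pitfall of this kind is documented in perlre, though it may not be the one that bit me here: Perl finds the end of the pattern before it looks at /x comments, so the pattern delimiter inside a comment closes the regex early. With the usual / delimiter, a comment containing a slash breaks the pattern. A small sketch of the workaround follows, using a made-up input string and hypothetical variable names: pick a different delimiter such as m{...}x, and then keep unbalanced braces out of the comments instead.

```perl
use strict;
use warnings;

# Hypothetical input, not taken from the report above.
my $line = 'ratio 3/4';

# With m{...}x the "/" is no longer the delimiter, so it is safe both in
# the comments and in the pattern itself.
my ($num, $den);
if ($line =~ m{
        (\d+)    # numerator, i.e. the part before the "/"
        /        # a literal slash
        (\d+)    # denominator
    }x) {
    ($num, $den) = ($1, $2);
}

print "$num over $den\n";
```

Written with / as the delimiter, the first comment alone would end the pattern at its slash, and the rest would be parsed as modifiers and ordinary Perl code.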


Justin Gedge

