On Thursday, July 24, 2003, at 06:38 PM, Mark Brownell wrote:
Is there a way to create a matchChunk regex that picks up all instances within a document that may contain several of the same items?
OK probably! But I don't know how to do it. Some thoughts...
In the PCRE manual there are examples of regexes for repeated patterns, subpatterns, recursive subpatterns, subroutine callouts, and so forth. Very powerful stuff. Some of it is available to us, and some is not. I think anything described as "you get this back from the C function blah blah" is out of reach, because the C API is not available to us.
What you describe above, "regex that picks up all instances within a document", in my experience is a difficult way to approach the problem. There are two main dimensions to consider in designing your regex.
1) The first dimension is how you break the document up into parts to feed into your pattern match, probably in a repeat structure, with each iteration feeding a new string to matchText or matchChunk. Where this string comes from, what length it is, and what its delimiters are will depend on, and be complementary to, the next dimension. The fit between these two dimensions is really the art of designing regular expressions.
2) The second dimension is the width of a single match of the pattern. It's possible to write one regex to match a many-line xml document, but it would be very complex and very unreliable unless you get it exactly right. That's not the way I would approach it. Narrow down the match to a small part of the document (a single node or element).
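To make the two dimensions concrete, here is a rough sketch. It is written in Python rather than Transcript, with Python's re module standing in for matchChunk, so treat it as an illustration of the idea and not as Revolution code. The sample document, the <item> element and the names DOC, ITEM and all_items are all made up for the example. The pattern is deliberately narrow (one element, one capture), and the loop restarts the search just past the previous match, which is roughly what a repeat loop around matchChunk would do with the positions it returns.

```python
import re

# Hypothetical sample document and pattern, not from the original question.
DOC = "<item>apple</item>\n<note>skip me</note>\n<item>pear</item>\n<item>fig</item>"
ITEM = re.compile(r"<item>([^<]*)</item>")  # narrow: one element, one capture


def all_items(doc):
    """Collect (start, end, text) for the content of every <item> element."""
    found = []
    pos = 0
    while True:
        m = ITEM.search(doc, pos)  # one narrow match per iteration
        if m is None:
            break
        found.append((m.start(1), m.end(1), m.group(1)))
        # Resume after this match; step one character on an empty match
        # so the loop always advances.
        pos = m.end() if m.end() > m.start() else m.end() + 1
    return found


print([text for _, _, text in all_items(DOC)])  # ['apple', 'pear', 'fig']
```

The regex itself never has to know about the whole document. All the "find every instance" logic lives in the loop, which is the part that is easy to get right.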
You do have some flexibility in the width of this dimension though, because you can have repeated patterns in the regex, and can capture multiple parts out of the match using parens () in your pattern. Unfortunately matchText and matchChunk do not take an array for their match variables (foundVarsList and positionVarsList). However, you could match many things in one pattern, like matchText(tStr, "()()()...()", t1,t2,t3...tn), probably limited only by the max number of parameters in a Transcript function call, if there is a max.
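A small sketch of the several-captures-in-one-pattern idea, again in Python as a stand-in and not in Transcript. The record format and the names LINE, PATTERN and fields are invented for the example. One difference to keep in mind: Python hands back all the groups together, whereas with matchText each pair of parens would fill one of t1, t2, t3 and so on.

```python
import re

# Hypothetical record format, chosen only for illustration.
LINE = "name=Mark; date=2003-07-24; list=use-revolution"

# Three pairs of parens, so three captured values from a single match.
PATTERN = re.compile(r"name=(\w+); date=([\d-]+); list=([\w-]+)")

m = PATTERN.search(LINE)
fields = list(m.groups()) if m else []

print(fields)  # ['Mark', '2003-07-24', 'use-revolution']
```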
Another flexibility in this dimension is the topicality of the pattern match. The modifiers (?smx) can be used to adjust whether "^" and "$" match at every line or only at the ends of the whole string, whether the "." (any-character) also matches a line break, and whether whitespace inside the pattern is ignored. So the topicality relates in a way back to the 1st dimension: how are you feeding the data into the pattern match functions in the first place?
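Here is a quick sketch of what each of the three modifiers changes. It uses Python's re module, which accepts the same (?s), (?m) and (?x) inline modifiers as PCRE when they are placed at the start of the pattern; I am assuming, not confirming, that matchText behaves the same way. The sample text and variable names are made up.

```python
import re

# Hypothetical three-line sample: the <a> element spans two lines.
TEXT = "<a>one\ntwo</a>\n<b>three</b>"

# Without (?s), "." stops at a line break, so the two-line <a> element fails.
no_dotall = re.search(r"<a>(.*?)</a>", TEXT)

# With (?s), "." also matches the line break.
dotall = re.search(r"(?s)<a>(.*?)</a>", TEXT)

# With (?m), "^" matches at the start of every line, not just the string.
line_starts = re.findall(r"(?m)^<(\w+)>", TEXT)

# With (?x), whitespace and "#" comments inside the pattern are ignored.
verbose = re.search(r"""(?x)
    <b>      # opening tag
    (\w+)    # element content
    </b>     # closing tag
""", TEXT)

print(no_dotall)         # None
print(dotall.group(1))   # the two lines "one" and "two"
print(line_starts)       # ['a', 'b']
print(verbose.group(1))  # three
```

If you feed the pattern one line at a time, (?s) and (?m) hardly matter. If you feed it the whole document, they decide what a match can span.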
Regular Expressions are extremely powerful, but also pretty darn confusing when you look at the more advanced usages. Hopefully not any more confusing now :-)
I used to program in Perl a lot. A lot of things about Perl suck, but its regular expression capabilities are great. We are fortunate to have this PCRE engine in RR now.
Alex Rice, Software Developer Architectural Research Consultants, Inc. http://ARCplanning.com