Scott Johnson wrote:
> So to start out I wrote this RegEx which I thought would match the
> capitalized text and the two new line feeds and then I was going to replace
> the match with just the capitalized text. This was my first stab at it:
> s/\([A-Z]{2,}\)*\n\n/<$1>/g
If you're looking to do as you've said, perhaps a more 'correct' regex
might be:
s/(\([A-Z]{2,}\))\n\n/$1/g; # Notice the ()'s wrapping your escaped ()'s
You didn't wrap your '(UG)' detection string in capturing parens, so
there was no group for $1 to refer to and it came back empty. Escaped
parens, \( and \), only match literal parenthesis characters; it takes a
bare ( ) pair to capture. Quantifiers are greedy, so in a match such as
this it pays to be _very_ explicit about what each one applies to. Your
'*', which means 'match zero or more of the previous item', was attached
to the escaped ')', so the pattern accepted zero or more literal
close-parens after the capitals. That is legal, but probably not what
you meant: it would also match '(UG' with no closing paren at all. Drop
the '*' and let the '{2,}' quantifier do the counting.
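For what it's worth, here is a small sketch of the difference the
capturing group makes. The sample text is made up for illustration; only
the two substitutions come from this thread.

```perl
use strict;
use warnings;

# Hypothetical sample input: parenthesised codes, each followed by a blank line.
my $text = "first line (UG)\n\nsecond line (ABC)\n\nend\n";

# Original attempt: \( and \) are literal parens, so nothing is captured.
my $broken = $text;
{
    no warnings 'uninitialized';    # $1 is undef here, silence the warning
    $broken =~ s/\([A-Z]{2,}\)*\n\n/<$1>/g;
}

# Suggested version: the outer ( ) captures the literal "(UG)" into $1.
my $fixed = $text;
$fixed =~ s/(\([A-Z]{2,}\))\n\n/$1/g;

print "broken: $broken";
print "fixed:  $fixed";
```

The first substitution throws the matched text away because $1 is empty;
the second keeps the '(UG)' and drops only the two newlines.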
> Yeah, kinda ignorantly structured, but it was my first attempt. To just see
> what would be matched without the substitution I simplified it to just this:
>
> [A-Z]{2,}*\n\n
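Again, the '*' is out of place: here it sits directly after the '{2,}'
quantifier, and Perl does not let you stack one quantifier on another
like that (it should complain about nested quantifiers rather than
match anything). I'd say this was your problem.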
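A quick way to check a pattern like this is to compile it inside an eval
and see whether Perl accepts it. This is only a sketch: the two pattern
strings are the one quoted above and the same pattern with the '*'
removed, and as far as I know Perl rejects the first with a nested
quantifiers error.

```perl
use strict;
use warnings;

# Single quotes keep the backslashes intact so the regex engine sees \n.
my @patterns = ('[A-Z]{2,}*\n\n', '[A-Z]{2,}\n\n');

my %ok;
for my $pat (@patterns) {
    # qr// compiles at run time here; eval traps a compile error in $@.
    my $re = eval { qr/$pat/ };
    $ok{$pat} = defined $re ? 1 : 0;
    print "$pat : ", ($ok{$pat} ? "compiles\n" : "rejected: $@");
}
```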
hth
- jc
-------------------------------------------------
James Diggans Phone: 301.987.1756
Gene Logic, Inc. FAX: 301.987.1701
[EMAIL PROTECTED] Cell: 301.908.2477
-------------------------------------------------