Here's something a bit simpler, based on the original example Barry sent. It looks for a single upper-case letter preceded by a single character that is neither upper case nor white space. \w doesn't do that (it also matches upper-case letters, digits and underscores), and we don't need the "+" quantifier either, since all we care about is matching a single character. (Matching a fixed-length string should also perform better than searching for a variable-length one.)

perl -we 'my $t="madeStyle\nfacilitatedOne\nAnti-magneticQuality\n123FOO BAR"; $t=~s/([^A-Z\s])([A-Z])/$1. $2/g; print "----------\n$t\n";'

----------
made. Style
facilitated. One
Anti-magnetic. Quality
123. FOO BAR
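If it is easier to test from a script than from a one-liner, the same substitution can be wrapped in a small sub. This is only a minimal sketch: the sub name split_at_capital and the sample list are made up for illustration, and the copy-then-substitute idiom is used so it should also run on an older Perl such as 5.8.8.

```perl
use strict;
use warnings;

# Hypothetical helper (the name is not from the thread): insert ". "
# between a character that is neither upper case nor white space and
# the upper-case letter that immediately follows it.
sub split_at_capital {
    my ($text) = @_;
    # Copy first, then substitute on the copy, so the caller's string
    # is left untouched.
    (my $copy = $text) =~ s/([^A-Z\s])([A-Z])/$1. $2/g;
    return $copy;
}

# Same sample words as the one-liner.
for my $word ('madeStyle', 'facilitatedOne', 'Anti-magneticQuality', '123FOO BAR') {
    print "$word => ", split_at_capital($word), "\n";
}
```

Note that '123FOO BAR' comes back as '123. FOO BAR': the digit 3 counts as a non-upper-case, non-white-space character, while the space between FOO and BAR keeps the B from matching.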
Curtis

________________________________
From: activeperl-boun...@listserv.activestate.com [mailto:activeperl-boun...@listserv.activestate.com] On Behalf Of williamawalt...@aol.com
Sent: Friday, May 15, 2009 8:55 PM
To: ari.constan...@gmail.com
Cc: activeperl@listserv.activestate.com
Subject: Re: Help with Regular Expression

hi ari and barry --

In a message dated 5/15/2009 6:20:40 PM Eastern Standard Time, ari.constan...@gmail.com writes:

> On Fri, May 15, 2009 at 11:18 PM, Barry Brevik <bbre...@stellarmicro.com> wrote:
> > I am running Active Perl 5.8.8.
> > ...
> > Difficulty: the fields contain hundreds of words both preceding and
> > following the "bad" words, so I have to be able to pick out the
> > lower-case words that contain one embedded upper-case character.
> > ...
> > Barry Brevik
>
> Hi Barry,
>
> Maybe something like this would help:
>
> $ cat test.txt
> madeStyle
> facilitatedOne
> Anti-magneticQuality
>
> $ cat test.txt |perl -pe 's/(\w+)([A-Z])/\1\. \2/g'
> made. Style
> facilitated. One
> Anti-magnetic. Quality
>
> Regards, Ari Constancio

the replacement string in a s/// should use capture variables rather than backreferences; perl warns about this if warnings are on (always a good idea). a '.' (period) character in a replacement string is not a metacharacter and needs no escape. also, the regex used, /(\w+)([A-Z])/, will allow any number greater than zero of upper case letters, digits or underscores to precede the uc letter that is supposed to be the initial letter of a new sentence: probably not what is intended.

>cat test.txt
madeStyle
facilitatedOne
Anti-magneticQuality
123FOO

>cat test.txt | perl -wMstrict -pe "s/(\w+)([A-Z])/\1\. \2/g"
\1 better written as $1 at -e line 1.
\2 better written as $2 at -e line 1.
made. Style
facilitated. One
Anti-magnetic. Quality
123FO. O

a better approach might be something like:

>cat test.txt | perl -wMstrict -pe "s{ ([[:lower:]]) ([[:upper:]] [[:lower:]]) }{$1. $2}xmsg"
made. Style
facilitated. One
Anti-magnetic. Quality
123FOO

hth -- bill walters
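To see the difference between the two patterns above without going through the shell, here is a rough sketch that puts both into subs. The sub names are invented for illustration; the regexes themselves are the ones quoted in this message, with $1 and $2 used in the replacement instead of \1 and \2.

```perl
use strict;
use warnings;

# Hypothetical name. Uses the /(\w+)([A-Z])/ pattern: \w+ also matches
# digits, underscores and upper-case letters, so after backtracking it
# can split in the middle of an all-caps run.
sub split_word_chars {
    my ($text) = @_;
    (my $copy = $text) =~ s/(\w+)([A-Z])/$1. $2/g;
    return $copy;
}

# Hypothetical name. Uses the POSIX-class pattern: split only where a
# lower-case letter is followed by an upper-case letter that is itself
# followed by a lower-case letter.
sub split_lower_upper {
    my ($text) = @_;
    (my $copy = $text) =~ s{ ([[:lower:]]) ([[:upper:]] [[:lower:]]) }{$1. $2}xmsg;
    return $copy;
}

for my $word ('madeStyle', 'Anti-magneticQuality', '123FOO') {
    print "$word\n";
    print "  \\w+ pattern:    ", split_word_chars($word), "\n";
    print "  POSIX classes:  ", split_lower_upper($word), "\n";
}
```

Both subs agree on the mixed-case words; they differ only on '123FOO', where the \w+ version gives '123FO. O' and the POSIX-class version leaves the string alone.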