Here's something a bit simpler, based on the original example Barry sent.
It looks for a single upper-case letter preceded by a single character
that is neither upper case nor white space.  \w doesn't express that (it
also matches upper-case letters), and there is no need for the "+"
quantifier, since all we care about is matching a single char.  (A
fixed-length match should also perform a little better than a
variable-length one.)
 
perl -we 'my $t="madeStyle\nfacilitatedOne\nAnti-magneticQuality\n123FOO BAR";
             $t=~s/([^A-Z\s])([A-Z])/$1. $2/g;
             print "----------\n$t\n";'
----------
made. Style
facilitated. One
Anti-magnetic. Quality
123. FOO BAR
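
The same rule can also be sketched with zero-width lookaround assertions instead of capture groups. This is a swapped-in variant, not what was posted above, and it assumes the same sample data; it should give the same result on that data.

```perl
use strict;
use warnings;

# Same sample data as the one-liner above.
my $t = "madeStyle\nfacilitatedOne\nAnti-magneticQuality\n123FOO BAR";

# Zero-width match: the position just after a non-upper-case,
# non-white-space char and just before an upper-case letter.
# Nothing is consumed, so the replacement is only the text to insert.
$t =~ s/(?<=[^A-Z\s])(?=[A-Z])/. /g;

print "----------\n$t\n";
```

Since nothing is captured, there is no $1 or $2 to carry through to the replacement.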
 

Curtis


________________________________

From: activeperl-boun...@listserv.activestate.com
[mailto:activeperl-boun...@listserv.activestate.com] On Behalf Of
williamawalt...@aol.com
Sent: Friday, May 15, 2009 8:55 PM
To: ari.constan...@gmail.com
Cc: activeperl@listserv.activestate.com
Subject: Re: Help with Regular Expression


hi ari and barry --    

In a message dated 5/15/2009 6:20:40 PM Eastern Standard Time,
ari.constan...@gmail.com writes: 

> On Fri, May 15, 2009 at 11:18 PM, Barry Brevik <bbre...@stellarmicro.com> wrote: 
> 
> > I am running Active Perl 5.8.8. 
> > ... 
> > Difficulty: the fields contain hundreds of words both preceding and 
> > following the "bad" words, so I have to be able to pick out the 
> > lower-case words that contain one embedded upper-case character. 
> > ... 
> > Barry Brevik 
> 
> Hi Barry, 
> 
> Maybe something like this would help: 
> 
> $ cat test.txt 
> madeStyle 
> facilitatedOne 
> Anti-magneticQuality 
> 
> $ cat test.txt |perl -pe 's/(\w+)([A-Z])/\1\. \2/g' 
> made. Style 
> facilitated. One 
> Anti-magnetic. Quality 
> 
> Regards, Ari Constancio 

the replacement string in a  s///  should use capture variables rather 
than backreferences; perl warns about this if warnings are on (always 
a good idea).   a '.' (period) character in a replacement string is not 
a metacharacter and needs no escape.    
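
a minimal sketch of those two points as a standalone script. the variable names and the word 'letterBox' are made up for illustration; 'madeStyle' is from the sample data.

```perl
use strict;
use warnings;

# Sample word from the test data; variable names are illustrative only.
my $word = 'madeStyle';

# Replacement side: use the capture variables $1 and $2. The period is a
# plain character there, so it needs no backslash.
(my $fixed = $word) =~ s/([a-z])([A-Z])/$1. $2/;
print "$fixed\n";

# Pattern side: this is where a backreference like \1 belongs, e.g. to
# detect a doubled character ('letterBox' is a made-up example word).
my $has_double = ('letterBox' =~ /(.)\1/) ? 1 : 0;
print "doubled char found: $has_double\n";
```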

also, the regex used, /(\w+)([A-Z])/, will allow any number greater than 
zero of upper case letters, digits or underscores to precede the uc letter 
that is supposed to be the initial letter of a new sentence: probably not 
what is intended.    

>cat test.txt 
madeStyle 
facilitatedOne 
Anti-magneticQuality 
123FOO 

>cat test.txt | perl -wMstrict -pe "s/(\w+)([A-Z])/\1\. \2/g" 
\1 better written as $1 at -e line 1. 
\2 better written as $2 at -e line 1. 
made. Style 
facilitated. One 
Anti-magnetic. Quality 
123FO. O 

a better approach might be something like:    

>cat test.txt | perl -wMstrict -pe "s{ ([[:lower:]]) ([[:upper:]] [[:lower:]]) }{$1. $2}xmsg" 
made. Style 
facilitated. One 
Anti-magnetic. Quality 
123FOO 
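
the same substitution as a standalone script, in case a one-liner is awkward to quote on your shell. this is only a sketch: the sub name split_sentence is made up here, and the inputs are the four sample words from above.

```perl
use strict;
use warnings;

# Sample words from the test file above.
my @words = ('madeStyle', 'facilitatedOne', 'Anti-magneticQuality', '123FOO');

# Insert ". " only where a lower-case letter is directly followed by an
# upper-case letter that itself starts a lower-case run.
# (split_sentence is a hypothetical helper name.)
sub split_sentence {
    my ($text) = @_;
    $text =~ s{ ([[:lower:]]) ([[:upper:]] [[:lower:]]) }{$1. $2}xmsg;
    return $text;
}

print split_sentence($_), "\n" for @words;
```

note that this pattern requires a lower-case letter after the upper-case one, so a one-letter word at the start of a sentence (such as "A" followed by a space) would not be split off.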

hth -- bill walters    


_______________________________________________
ActivePerl mailing list
ActivePerl@listserv.ActiveState.com
To unsubscribe: http://listserv.ActiveState.com/mailman/mysubs
