On Tue, May 22, 2012 at 06:25:35PM -0400, Ronald J Kimball wrote:
> #!perl -p0
>
> s{(<div class="topic">)(.*?)(</div>)}
> < { ($x = $2) =~ s!([A-Z]+)!\L\u$1!g } "$1$x$3" >sge;
>
> __END__

I'm back! Here's what's going on with this script:

#!perl -p0

-p wraps the script in an implicit loop that reads the input one record
at a time (by default, one line) into $_, executes your code, and then
prints $_.

-0 (zero) sets the input record separator to the null character, instead
of "\n", so that the whole file is read in one chunk (assuming it doesn't
contain any null characters), in case the HTML we want to match is split
across multiple lines.
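If you want to see the effect of -0 without touching a file, here is a rough sketch that sets $/ by hand (which is what -0 does for you) and reads from an in-memory filehandle. The sample text and the read_records helper are made up for illustration; they are not part of the script.

```perl
use strict;
use warnings;

# Hypothetical helper: read every record from a string using the given
# input record separator, roughly the way the implicit -p loop would.
sub read_records {
    my ($text, $separator) = @_;
    local $/ = $separator;                  # -0 sets $/ to "\0"
    open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
    my @records = <$fh>;
    close $fh;
    return @records;
}

# Made-up sample input.
my $html = "<div class=\"topic\">\nSOME TOPIC\n</div>\n";

my @by_line = read_records($html, "\n");    # default: one record per line
my @by_null = read_records($html, "\0");    # -0: the whole text is one record

printf "newline separator: %d records\n", scalar @by_line;
printf "null separator:    %d records\n", scalar @by_null;
```

If the file might contain null characters, perlrun documents -0777 as the conventional way to slurp a whole file regardless of its contents.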

s{(<div class="topic">)(.*?)(</div>)}
 < { ($x = $2) =~ s!([A-Z]+)!\L\u$1!g } "$1$x$3" >sge;

This performs a substitution, of course.

The pattern is enclosed in {}, but the replacement is enclosed in <>
because it uses all the other paired delimiters ({}, (), and []) inside.
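To convince yourself that the mixed {} and <> delimiters parse the way described, here is the same statement run against a string instead of a file. The sample input is made up, and the "my $x" is only there to keep "use strict" happy; the original script uses $x as an undeclared global.

```perl
use strict;
use warnings;

my $x;    # assumption: declared here only so the code runs under strict

# Made-up sample input, standing in for the record that -p0 would read.
$_ = qq{<div class="topic">MY TOPIC</div>\n};

# The statement from the script, unchanged: pattern in {}, replacement in <>.
s{(<div class="topic">)(.*?)(</div>)}
 < { ($x = $2) =~ s!([A-Z]+)!\L\u$1!g } "$1$x$3" >sge;

print;    # prints: <div class="topic">My Topic</div>
```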

(<div class="topic">)(.*?)(</div>)

The pattern matches the opening div tag, as little text as possible (.*?
is non-greedy), and the first closing div tag, and captures those three
pieces in $1, $2, and $3.
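A quick sketch of what ends up in each capture, using a made-up snippet of HTML where the div is split across lines. The variable names are mine, not from the script.

```perl
use strict;
use warnings;

# Made-up sample input; the topic div spans several lines on purpose.
my $html = "<p>intro</p>\n"
         . "<div class=\"topic\">\nFIRST TOPIC\n</div>\n"
         . "<div>other</div>\n";

my ($open, $text, $close);

# /s lets . match the newlines inside the div.
if ($html =~ m{(<div class="topic">)(.*?)(</div>)}s) {
    ($open, $text, $close) = ($1, $2, $3);
    print "open:  [$open]\n";
    print "text:  [$text]\n";
    print "close: [$close]\n";
}
```

Because .*? is non-greedy, the match stops at the first </div> after the opening tag instead of running on to the last one in the file.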

{ ($x = $2) =~ s!([A-Z]+)!\L\u$1!g } "$1$x$3"

The replacement copies the matched text to $x (because $2 is read-only),
and then performs a substitution on it - this is the loop inside the
loop. It then returns a string containing the values of $1, $x, and $3.

The inner substitution is inside a block so that its own captures go away
when the block ends. That way it doesn't clobber the outer $1 and $3,
which are still needed to build the final string.
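The block trick relies on capture variables being dynamically scoped to the enclosing block. Here is a minimal sketch of that behaviour on its own, with made-up strings and a hypothetical @seen array to record what $1 holds at each step.

```perl
use strict;
use warnings;

my @seen;                                   # records $1 at each step
my $inner = "INNER";                        # made-up inner string

if ("outer-value" =~ /(\w+)-(\w+)/) {
    push @seen, $1;                         # $1 from the outer match
    {
        $inner =~ /([A-Z]+)/ or die "inner match failed";
        push @seen, $1;                     # $1 from the inner match
    }
    push @seen, $1;                         # outer $1 is visible again
}

print join(", ", @seen), "\n";              # prints: outer, INNER, outer
```

Without the inner braces, the third value would be "INNER", which is the clobbering the script avoids.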

s!([A-Z]+)!\L\u$1!g

The inner substitution finds all sequences of uppercase letters and
converts them to title case. \L lowercases subsequent characters, and \u
uppercases the next character. (I guess \u\L would be more correct, but
Perl "does what I mean" with \L\u, treating it as \u\L anyway.)

If you wanted to convert something like AbC to Abc, you could change
[A-Z]+ to [A-Z][a-zA-Z]*
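Here is a small sketch comparing the two patterns on a few made-up strings. The helper names title_case_caps and title_case_words are hypothetical, just wrappers so the two substitutions can be tried side by side.

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the inner substitution from the script.
sub title_case_caps {
    my ($text) = @_;
    $text =~ s!([A-Z]+)!\L\u$1!g;
    return $text;
}

# Hypothetical helper using the wider pattern, with the escapes written
# in the "more correct" \u\L order.
sub title_case_words {
    my ($text) = @_;
    $text =~ s!([A-Z][a-zA-Z]*)!\u\L$1!g;
    return $text;
}

print title_case_caps("HELLO WORLD"), "\n";   # Hello World
print title_case_caps("AbC"), "\n";           # AbC (A and C match separately)
print title_case_words("AbC"), "\n";          # Abc
```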

sge;

These substitution flags specify single-line mode (. matches newline),
global (find and replace all matches), and eval (evaluate the replacement
as Perl code instead of treating it as a string).
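Putting the three flags together, here is a more spelled-out sketch of the same idea that runs on an in-memory string. It is not the script itself: the helper sub, the lexical variables, and the sample HTML are my own assumptions, chosen so that it runs under "use strict".

```perl
use strict;
use warnings;

# Hypothetical helper: title-case runs of capitals, like the inner s!!!.
sub title_case_caps {
    my ($text) = @_;
    $text =~ s/([A-Z]+)/\u\L$1/g;
    return $text;
}

# Made-up sample input with two topic divs and one unrelated div.
my $html = <<'HTML';
<div class="topic">
FIRST TOPIC
</div>
<div class="other">LEFT ALONE</div>
<div class="topic">SECOND ONE</div>
HTML

# s: . matches newline; g: every topic div; e: replacement is Perl code.
(my $result = $html) =~
    s{(<div class="topic">)(.*?)(</div>)}{
        my ($open, $text, $close) = ($1, $2, $3);   # copy before any inner match
        $open . title_case_caps($text) . $close;
    }sge;

print $result;
```

Only the text inside the two topic divs is changed; the "other" div is left as it was.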

Hope that helps!

Ronald