Peter Gordon wrote:
> Wouldn't it be cute and much more intuitive if we could write this?
>
> s!<td.*?>(.*?)</td>!$1 =~ s/\s/ /g!eg
>
Yes. It wouldn't.
These variables are read-only for very good reasons, y'know :-)
Also notice the regex you just wrote doesn't do what you originally
asked for.
Assuming you wouldn't get errors for modifying a read-only variable,
you're completely deleting <td> and </td>.
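
A quick way to see both problems (a sketch only; the sample string, the variable names and the use of "_" as a visible stand-in for the replacement character are my own assumptions, not part of the original question):

```perl
use strict;
use warnings;

my $html = '<tr><td class="a">foo bar</td></tr>';

# 1) Writing to $1 inside the replacement dies at run time.
my $copy  = $html;
my $ok    = eval { $copy =~ s!<td.*?>(.*?)</td>!$1 =~ s/\s/_/g!eg; 1 };
my $error = $ok ? '' : $@;
print "error: $error";

# 2) Working on a copy of $1 runs, but the whole match is replaced,
#    so the <td ...> and </td> tags disappear from the result.
(my $fixed = $html) =~ s!<td.*?>(.*?)</td>!my $t = $1; $t =~ s/\s/_/g; $t!eg;
print "$fixed\n";
```

The first substitution should fail with Perl's read-only modification error; the second leaves only the cell text between <tr> and </tr>.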
I hope you don't mean that you want to change the current behaviour that
we all know and love into
"match-but-do-not-replace-anything-that's-not-grouped" :-)
So basically we:
- *Have* to tell the engine what we want to "match-but-not-replace", as
opposed to the current "match-and-replace"
- *Really* want $1 to be read-only (think of your debugging experience
when some function somewhere modifies $1 :))
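
On perls without \K, one common way to meet both constraints is to capture the parts you want to keep and put them back yourself. This is only a sketch: mark_spaces() is a hypothetical helper, and "_" again stands in for the replacement character so the effect is visible.

```perl
use strict;
use warnings;

# Hypothetical helper: works on a copy, so the capture variables stay untouched.
sub mark_spaces {
    my ($text) = @_;
    $text =~ s/\s/_/g;
    return $text;
}

my $row = '<td class="a">foo bar</td>';

$row =~ s{ (<td.*?>)   # group 1: opening tag, put back unchanged
           (.+?)       # group 2: cell text, the only part we rewrite
           (</td>)     # group 3: closing tag, put back unchanged
         }{
           my ($open, $text, $close) = ($1, $2, $3);
           $open . mark_spaces($text) . $close;
         }egx;

print "$row\n";
```

Copying the captures into lexicals first keeps the replacement independent of whatever matching the helper does internally.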
The cleanest solution would probably be (works starting from perl 5.9.5):
    $str =~ s{<td.*?>    # variable-length look-behind :)
              \K         # tells Perl to "keep" that <td>
              (.+?)      # text that we want to s///
              (?=</td>)  # look-ahead; won't be replaced
             }{htmlify()}egx;
    sub htmlify {local $_=$1; s/\s/ /g; $_}
I think the substitution should have occurred in a subroutine anyway,
even if $1 was writable.
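
For anyone who wants to try it, here is a self-contained variant of the same idea (a sketch under my own assumptions: the sample row, the helper name mark_spaces() and the "_" replacement are made up for illustration, and the helper takes its text as an argument instead of reading $1 directly):

```perl
use strict;
use warnings;
use 5.010;    # the keep escape used below needs perl 5.10 or later

# Hypothetical helper: replaces each whitespace character with "_"
sub mark_spaces {
    my ($text) = @_;
    $text =~ s/\s/_/g;
    return $text;
}

my $str = '<tr><td class="a">foo bar</td><td>baz  qux</td></tr>';

$str =~ s{ <td.*?>     # opening tag: matched, then kept by the next token
           \K
           (.+?)       # cell text, the only part that gets replaced
           (?=</td>)   # closing tag: checked but not consumed
         }{ mark_spaces($1) }egx;

say $str;
```

Only the text inside each cell changes; the whitespace inside the opening tag is left alone because it sits before \K. Note that (.+?) needs at least one character, so the sample deliberately has no empty cells.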
Maybe it would be cool if look-ahead had a backslash-thingie notation.
Erm...
And you might like this one; I personally hate it:
    my $text;
    $str =~ s{<td.*?> \K
              (.+?)
              (?{ $text = $^N; $text =~ s/\s/ /g; })
              (?=</td>)
             }{$text}egx;
If you have any better ideas of how it could/should look (with the
constraints I've mentioned, or a way to break them without ruining every
Perl programmer's life by breaking the current behaviour) please share
them :-)
HTH,
~Y