On 14/08/07, Peter Gordon <[EMAIL PROTECTED]> wrote:
> Hi.
>
> Let's suppose that I have the following lines in an HTML file.
> I want to substitute the spaces in the date part with non-breaking spaces
> (&nbsp;)
>
> <td style="text-align: left" bgcolor="#92c1bb">Aug 12 23:59:59 2007 GMT</td>
> <td style="text-align: left" bgcolor="#92c1bb">Aug 12 23:59:59 2007 GMT</td>
>
> I came up with this line - but somehow it isn't aesthetic.
>
> s!(<td.*?>)(.*?)(</td>)!my $t1 = $1 ;my $t2 = $2 ; my $t3 = $3 ; $t2 =~
> s/\s/&nbsp;/g ; "$t1$t2$t3" ;!egs ;
>
> Is there a nicer/cleaner way to write it?
It's a clever way, but I am very much into "Perl Best Practices"
lately, which says "Don't be clever" :)
There's more than one way to do it (TMTOWTDI), of course.
I thought of a different regex for this, one without the /e modifier. I
considered using a lookbehind assertion, something like (?<= .*>), but
apparently variable-length lookbehind assertions are not implemented.
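One way to get by without /e and without lookbehind is to capture the
prefix and put it back, repeating until no whitespace is left. This is
only a sketch: the sub name nbsp_in_cells_loop is made up for
illustration, and it assumes the cell holds plain text with no nested
tags.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): replace whitespace in
# the text of <td> cells without using the /e modifier.
# Assumption: cell contents are plain text with no nested tags.
sub nbsp_in_cells_loop {
    my ($html) = @_;

    # Each pass rewrites the first remaining whitespace character that
    # follows an opening <td ...> tag and comes before the next '<'.
    # The loop ends because '&nbsp;' itself contains no whitespace.
    1 while $html =~ s/(<td[^>]*>[^<]*?)\s/$1&nbsp;/;

    return $html;
}

my $row = '<td style="text-align: left" bgcolor="#92c1bb">'
        . 'Aug 12 23:59:59 2007 GMT</td>';
print nbsp_in_cells_loop($row), "\n";
```

The spaces inside the tag's attributes are left alone, because [^>]*
consumes them as part of the opening tag. The price is one regex pass per
replaced space, which is fine for short cells.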
I could also recommend HTML::Parser, but if all you need is to replace
some spaces, it would be overkill.
So your algorithm is OK, but if each string holds a single cell, you
don't need the outer s/// at all, and if you do use it, you don't need
the outer /g, because the first part of the outer s/// is used only for
capturing the HTML.
I would write the same algorithm more readably and simply like this:
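if ($str =~ m{
        (<td.*?>)   # opening tag with its attributes
        (.*?)       # cell contents
        (</td>)     # closing tag
    }xms)
{
    my ($t1, $t2, $t3) = ($1, $2, $3);
    $t2 =~ s/\s/&nbsp;/g;
    # Assumes $str holds a single cell: text outside the match is dropped.
    $str = "$t1$t2$t3";
}
else {
    print "expected HTML not found\n";
}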
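
If the string can hold several cells, or other markup around them, the
outer s///g is still the simplest tool, and moving the inner substitution
into a small sub keeps the replacement part short. A sketch only: the sub
names to_nbsp and nbsp_all_cells are made up for illustration, and it
assumes <td> elements are not nested.

```perl
use strict;
use warnings;

# Hypothetical helper: turn every whitespace character into &nbsp;.
sub to_nbsp {
    my ($text) = @_;
    $text =~ s/\s/&nbsp;/g;
    return $text;
}

# Hypothetical wrapper: apply to_nbsp to the contents of every <td> cell
# and leave everything outside the cells as it was.
# Assumption: <td> elements are not nested.
sub nbsp_all_cells {
    my ($html) = @_;
    $html =~ s{ (<td\b[^>]*>) (.*?) (</td>) }
              { $1 . to_nbsp($2) . $3 }gexms;
    return $html;
}

my $html = qq{<tr>\n<td bgcolor="#92c1bb">Aug 12 23:59:59 2007 GMT</td>\n</tr>\n};
print nbsp_all_cells($html);
```

Here /g matters, because each match handles one cell, and /e is what lets
the replacement call to_nbsp.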
_______________________________________________
Perl mailing list
[email protected]
http://perl.org.il/mailman/listinfo/perl