On Sat, Jan 26, 2013 at 06:16:14PM -0800, Jim Gibson wrote:
> Better add periods to that regular expression character class:
> 
>   if( $link =~ /mailto:([\w@.]+)/ ) {
> 
> … or include everything up to but not including the second double-quote:
> 
>   if( $link =~ /"mailto:([^"]+)/ ) {

I've never used HTML::TreeBuilder::XPath, but I highly doubt that
the attr method would return the quotes (and if it did, they
could be single-quotes instead). It would probably be best to
find a module that knows how to properly parse mailto URIs, but
failing that I think that matching everything *from the
beginning*[1] up to, but not including, the first literal '?'
(where any header fields such as ?subject= begin) should suffice.

[1] You may wish to tolerate leading white space too, but I'm not
sure if that is valid.

if ($link =~ /^mailto:([^?]+)/) {
    my $email = $1;    # address part, without any ?subject=... headers
    ...
}

Untested, but can't *possibly* fail. ;)
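
For anyone who wants to try it, here is a minimal sketch of the same
idea wrapped in a small helper and run against a few sample links.
The name mailto_address and the example addresses are made up for
illustration, and the optional leading white space is only the
assumption from [1], not something I have checked against the spec.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of any module): returns the address
# part of a mailto: link, or undef if the string is not a mailto: link.
sub mailto_address {
    my ($link) = @_;
    return undef unless defined $link;

    # Anchor at the start, tolerate leading white space (an assumption),
    # then capture everything up to, but not including, the first '?'.
    return $link =~ /^\s*mailto:([^?]+)/ ? $1 : undef;
}

# Made-up sample links: plain, with header fields, with leading
# white space, and one that merely contains "mailto:" further in.
for my $link (
    'mailto:jim.gibson@example.com',
    'mailto:jim.gibson@example.com?subject=Hello',
    '  mailto:jim@example.com',
    'http://example.com/?q=mailto:nobody@example.com',
) {
    my $email = mailto_address($link);
    printf "%-50s => %s\n", $link, defined $email ? $email : '(no match)';
}
```

Because of the ^ anchor, the last link should not match even though
it contains "mailto:" further in. Note that this does nothing about
trailing white space or percent-encoded characters in the address.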

Regards,

-- 
Brandon McCaig <bamcc...@gmail.com> <bamcc...@castopulence.org>
Castopulence Software <https://www.castopulence.org/>
Blog <http://www.bamccaig.com/>
perl -E '$_=q{V zrna gur orfg jvgu jung V fnl. }.
q{Vg qbrfa'\''g nyjnlf fbhaq gung jnl.};
tr/A-Ma-mN-Zn-z/N-Zn-zA-Ma-m/;say'
