On Sat, Jan 26, 2013 at 06:16:14PM -0800, Jim Gibson wrote:
> Better add periods to that regular expression character class:
>
>     if( $link =~ /mailto:([\w@.]+)/ ) {
>
> … or include everything up to but not including the second double-quote:
>
>     if( $link =~ /"mailto:([^"]+)/ ) {

I've never used HTML::TreeBuilder::XPath, but I highly doubt that
the attr method would return the quotes (and if it did, they
could be single-quotes instead). It would probably be best to
find a module that knows how to properly parse mailto URIs, but
failing that I think that matching everything *from the
beginning*[1] up to a literal '?' should suffice.

[1] You may wish to tolerate leading white space too, but I'm not
sure if that is valid.

    if($link =~ /^mailto:([^\?]+)/) {
        my $email = $1;
        ...

Untested, but can't *possibly* fail. ;)

Regards,


-- 
Brandon McCaig <bamcc...@gmail.com> <bamcc...@castopulence.org>
Castopulence Software <https://www.castopulence.org/>
Blog <http://www.bamccaig.com/>
perl -E '$_=q{V zrna gur orfg jvgu jung V fnl. }.
q{Vg qbrfa'\''g nyjnlf fbhaq gung jnl.};
tr/A-Ma-mN-Zn-z/N-Zn-zA-Ma-m/;say'
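
For anyone who wants to try the anchored match described above, here is a
minimal, self-contained sketch. The helper name mailto_address and the sample
href values are hypothetical (they are not from the thread), and the optional
leading white space and the case-insensitive scheme match are assumptions
layered on top of the original pattern. It is no substitute for a module that
really parses mailto URIs: it does no percent-decoding and does not split
multiple recipients.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): return the address part of a
# mailto: href value, or undef if the value is not a mailto link.
sub mailto_address {
    my ($link) = @_;
    return unless defined $link;

    # Anchor at the start of the string, tolerate leading white space
    # (an assumption, see footnote [1]), and capture everything up to,
    # but not including, the first literal '?'.
    if ($link =~ /^\s*mailto:([^?]+)/i) {
        return $1;
    }
    return;
}

# Sample values standing in for whatever the attr method returns.
for my $href ('mailto:jim@example.com?subject=Hi',
              ' mailto:a.b@example.org',
              'http://example.com/') {
    my $email = mailto_address($href);
    print defined $email ? "$email\n" : "(not a mailto link)\n";
}
```

Because the pattern is anchored with ^, a value that merely contains
"mailto:" somewhere in the middle is rejected rather than half-matched, which
is the point of matching from the beginning.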