https://bz.apache.org/SpamAssassin/show_bug.cgi?id=8267

            Bug ID: 8267
           Summary: ExtractText.pm
           Product: Spamassassin
           Version: 4.0.1
          Hardware: PC
                OS: Windows 10
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Plugins
          Assignee: dev@spamassassin.apache.org
          Reporter: j...@disroot.org
  Target Milestone: Undefined

Hiya!

It looks like there's a bug here:

--- a/lib/Mail/SpamAssassin/Plugin/ExtractText.pm 2024-03-29 02:00:00.000000000 +0000
+++ b/lib/Mail/SpamAssassin/Plugin/ExtractText.pm 2024-07-06 21:56:00.788596023 +0100
@@ -601,7 +601,7 @@ sub _extract {
           push @{$coll->{flags}}, 'ActionURI';
           dbg("extracttext: ActionURI: $1");
           push @{$coll->{text}}, $text;
-          push @{$coll->{uris}}, $2;
+          push @{$coll->{uris}}, $1;
         } elsif($text =~ /QR-Code\:([^\s]*)/) {
           # zbarimg(1) prefixes the url with "QR-Code:" string
           my $qrurl = $1;

Note that the first group in the regex starts with "?:", so it is
non-capturing:

    if ($text =~ /<a(?:\s+[^>]+)?\s+href="([^">]*)"/) {

So the only capture you have is $1. $2 is undef.

A side note: the module documentation says

  "This module (ExtractText.pm) uses external tools to extract text from
  message parts, and then sets the text as the rendered part. **External
  tool must output plain text**, not HTML or other non-textual result."

Yet this code is parsing an HTML tag for an href attribute...

Cheers,
jps

-- 
You are receiving this mail because:
You are the assignee for the bug.
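
The capture-group claim in the report can be checked in isolation. The following is a minimal sketch, not code from ExtractText.pm: the pattern is copied verbatim from the report, while the helper name `href_captures` and the sample input string are assumptions made up for illustration. It counts how many captures the pattern yields and checks whether `$2` is defined after a successful match.

```perl
use strict;
use warnings;

# Pattern copied verbatim from the report. The first group begins with
# "?:" and is therefore non-capturing; only the href value is captured.
my $HREF_RE = qr/<a(?:\s+[^>]+)?\s+href="([^">]*)"/;

# Hypothetical helper (not part of ExtractText.pm): returns the list of
# captures the pattern produces for $text, or an empty list on no match.
sub href_captures {
    my ($text) = @_;
    my @captures = $text =~ $HREF_RE;   # list context returns ($1, $2, ...)
    return @captures;
}

# Sample input is an assumption, chosen so the optional attribute group
# is exercised as well.
my $sample = '<a class="x" href="https://example.com/">link</a>';

my @caps = href_captures($sample);
printf "number of captures: %d\n", scalar @caps;
printf "first capture: %s\n", $caps[0] if @caps;

# Direct check of the numbered variables after a successful match.
if ($sample =~ $HREF_RE) {
    printf "\$1 is %s\n", defined $1 ? "defined" : "undef";
    printf "\$2 is %s\n", defined $2 ? "defined" : "undef";
}
```

If the report is right, this should show a single capture holding the URL and an undefined `$2`, which is consistent with the proposed change from `$2` to `$1`.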