https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7176

--- Comment #1 from Mark Martinec <[email protected]> ---
> Sample of real spam:
>  Content-Disposition: attachment; filename= "xyzzy.zip"
> 
> Parser doesn't read the filename since it doesn't expect spaces (Message.pm
> -> _parse_normal):
> [...]
> So any reason not to change the C-D regex identical to C-T?

The formal syntax does not allow whitespace there:

RFC 2045:
  parameter := attribute "=" value
  attribute := token
  value := token / quoted-string
  token := 1*<any (US-ASCII) CHAR except SPACE, CTLs, or tspecials>
  tspecials :=  "(" / ")" / "<" / ">" / "@" /
                "," / ";" / ":" / "\" / <">
                "/" / "[" / "]" / "?" / "="
                ; Must be in quoted-string, to use within parameter values

although I suppose it won't hurt to allow whitespace there.
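
To make the grammar above concrete, here is a minimal Perl sketch of a strict
matcher for a single parameter. The names $token, $quoted and is_strict_param
are made up for illustration and are not SpamAssassin code, and the
quoted-string pattern is a simplified reading of RFC 822 (no folding, no
comments). It only shows that the sample fails a strict reading because of
the space after the '='.

```perl
use strict;
use warnings;

# token: one or more printable US-ASCII chars, excluding SPACE and tspecials
my $token  = qr{(?:(?![()<>\@,;:\\"/\[\]?=])[\x21-\x7e])+};
# quoted-string: simplified, allows backslash-escaped characters
my $quoted = qr{"(?:[^"\\\r]|\\.)*"};
# parameter := attribute "=" value, with nothing allowed around the "="
my $strict_param = qr{\A$token=(?:$token|$quoted)\z};

# Hypothetical helper: 1 if the string is a strictly conforming parameter
sub is_strict_param { return $_[0] =~ $strict_param ? 1 : 0 }

print is_strict_param('filename="xyzzy.zip"'),  "\n";  # conforming
print is_strict_param('filename=xyzzy.zip'),    "\n";  # conforming (bare token)
print is_strict_param('filename= "xyzzy.zip"'), "\n";  # space after "=", rejected
```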


>   if ($disp =~ /name="?([^\";]+)"?/i) {
>     $msg->{'name'} = $1;
> 
> If we look at the Content-Type parser, it does handle spaces (Util.pm ->
> parse_content_type): 
>   my($name) = $ct =~ /\b(?:file)?name\s*=\s*["']?(.*?)["']?(?:;|$)/i;

Both regexps are sloppy: the first one does not allow a 'token' but insists
on a 'quoted-string'; the second also accepts single quotes, and moreover
lets the terminating quote differ from the starting quote.
They should be fixed and unified (and relaxed to allow whitespace around the '=').
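
A rough sketch of what a unified, relaxed match might look like, under stated
assumptions: extract_name_param is a hypothetical helper, not a patch against
Message.pm or Util.pm, and it ignores RFC 2231 continuations and encoded
parameters. The two existing regexps are wrapped in the hypothetical subs
cd_name and ct_name purely for comparison.

```perl
use strict;
use warnings;

# Existing regexps as quoted in this bug, wrapped in hypothetical subs.
sub cd_name {
  my ($disp) = @_;
  return $disp =~ /name="?([^\";]+)"?/i ? $1 : undef;
}
sub ct_name {
  my ($ct) = @_;
  my ($name) = $ct =~ /\b(?:file)?name\s*=\s*["']?(.*?)["']?(?:;|$)/i;
  return $name;
}

# RFC 2045 token: printable US-ASCII except SPACE and tspecials.
my $token = qr{(?:(?![()<>\@,;:\\"/\[\]?=])[\x21-\x7e])+};

# Hypothetical unified extractor (an assumption, not SpamAssassin code):
# accepts a token or a properly closed double-quoted string, and
# tolerates whitespace around the "=".
sub extract_name_param {
  my ($hdr) = @_;
  return undef unless $hdr =~
    /(?:\A|[;\s])(?:file)?name\s*=\s*(?:"(?<q>(?:[^"\\]|\\.)*)"|(?<t>$token))/i;
  my ($q, $t) = ($+{q}, $+{t});
  return $t unless defined $q;   # bare token, returned as-is
  $q =~ s/\\(.)/$1/g;            # undo backslash escapes in quoted-string
  return $q;
}

for my $hdr ('attachment; filename= "xyzzy.zip"',
             'attachment; filename=xyzzy.zip',
             q{application/zip; name="xyzzy.zip'}) {
  printf "%-36s C-D=[%s] C-T=[%s] unified=[%s]\n", $hdr,
    map { defined $_ ? $_ : 'undef' }
      cd_name($hdr), ct_name($hdr), extract_name_param($hdr);
}
```

On the spam sample the C-D regexp captures only the single space after the
'=', while the C-T regexp and the sketch both yield xyzzy.zip. With the
mismatched quotes in the last input the C-T regexp still returns xyzzy.zip,
whereas the sketch returns nothing; whether rejecting or salvaging such input
is preferable for spam analysis is a separate design decision.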

