https://bz.apache.org/SpamAssassin/show_bug.cgi?id=6582
--- Comment #22 from Henrik Krohns <[email protected]> ---

Ok, I thought about this for a while and decided to implement it more simply in Message.pm instead of Node.pm. Truncation is also done on the nearest whitespace boundary. My earlier quick hack didn't do anything for rawbody, so I implemented a separate setting for it here:

body_part_scan_size 50000       (affects get_decoded_stripped_body_text_array)
rawbody_part_scan_size 500000   (affects get_decoded_body_text_array)

The most important setting is the one for body, since scanning large CSV attachments and the like makes things really slow. There are far fewer rawbody rules, and from a few quick tests a limit of 100k vs. 500k doesn't really change much, but I guess some cap is still good for pathological cases. It would be nice if someone with a corpus of large messages could mass-check different settings.

Note that the setting is per MIME part, so all textual parts are always scanned no matter what, just truncated. This means that if there are 20 text/plain parts, all of them are processed (20 x 50k truncated = 1 MB of text to process). I think it's easier to keep the logic simple than to implement some vague absolute scan limit (would we ignore parts, or truncate all of them equally by some extra amount?).

Here's a quick test of a ~4 MB mail with 8 x 500k CSV parts:

limit      runtime
unlimited  70s
100k       17s
50k        8s
20k        5s

I think 50k is a decent setting; it would take a very large message with lots of parts to "DoS". Please try mass-checks with different settings and compare the logs.

Sending        lib/Mail/SpamAssassin/Conf.pm
Sending        lib/Mail/SpamAssassin/Message.pm
Sending        lib/Mail/SpamAssassin.pm
Transmitting file data ...done
Committing transaction...
Committed revision 1841304.
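For reference, a minimal sketch of how these limits could be set in a site config such as local.cf (the option names and values are the ones described above; tune them with mass-check before relying on them):

  # Cap how much of each textual MIME part is fed to body rules
  # (get_decoded_stripped_body_text_array).
  body_part_scan_size    50000

  # Cap how much of each textual MIME part is fed to rawbody rules
  # (get_decoded_body_text_array).
  rawbody_part_scan_size 500000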

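For illustration only, a rough standalone Perl sketch of the "truncate on the nearest whitespace boundary" idea; this is a hypothetical helper, not the actual Message.pm code, and it simply backs up to the previous whitespace rather than reproducing the committed implementation:

  use strict;
  use warnings;

  # Truncate $text to at most $limit characters, backing up to the last
  # whitespace so a word is not cut in half. Returns $text unchanged if
  # no limit is given or the text already fits.
  sub truncate_at_whitespace {
      my ($text, $limit) = @_;
      return $text if !$limit || length($text) <= $limit;
      my $cut = substr($text, 0, $limit);
      # Drop the trailing partial word, if there is any whitespace to cut at.
      $cut =~ s/\s+\S*$//s;
      return $cut;
  }

  my $part = "word " x 20000;   # ~100k of sample text
  my $body = truncate_at_whitespace($part, 50000);
  printf "truncated to %d characters\n", length($body);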