Hi there,

On Tue, 29 Jun 2021, Scott Q. via clamav-users wrote:
> Lately I am receiving a lot of Spams originating within MS networks

I feel your pain. At present I'm seeing 40,000 to 50,000 attempts per month by Microsoft servers to send us spam. It's gone from really bad to almost unbelievable in the space of just a few weeks. When it was only a thousand or so I decided we'd live with it, but now the only answer has been to blacklist AS8075 entirely and forward it all to the spam reporting services. I'm starting to see some results from that.

Having said that, I'm not seeing the same sort of thing that you are. If you'd like to send me a sample privately I'll happily look at it.

> with attached PDF's that basically contain an image with a link. The body
> of the message is 7-8 random words such as:
>
> moka bu fyno da zosi ku xiqy zy
>
> These prove particularly difficult to filter and I'm thinking maybe
> running the PDF's links through the phishing checks might help. Is that
> possible or does anyone have other solutions for these messages ?

Steve at Sanesecurity might be able to come up with something if you submit a few samples to him.

For things like this I don't rely entirely on ClamAV and signatures, but on a milter which dismantles the MIME parts and passes them to clamd separately with a bit of extra logic. Without something like that you'll probably need to do a bit more work on the matching, as you'll have to work with the whole message body and it might be big.

It should be possible to match the body with Yara rules. You might get somewhere with a fairly simple pair of regexes: one expression matching the header parts enclosing the short text, and another matching the text itself. This is just a guess at the sort of thing which might work; adjust the character ranges to suit the spam. Just put this in a file called something.yar in the ClamAV database directory and restart clamd (I'm assuming you're using clamav-milter and clamd).

    rule Microsoft_spam
    {
        strings:
            $body_1 = /content-type.{10,500}content-type.{10,100}application\/pdf/ nocase ascii
            $body_2 = /content-type: text\/plain.{20,70}(([a-z]{1,6})\s){6,8}/ nocase ascii
        condition:
            all of them
    }

The first regex matches the bit of the MIME-formatted message which contains the header of the first part, the first body part, and just the header of the second part. I've assumed that the text precedes the PDF part. It's usually that way, but you'd have to tweak it if that's not the case.

The second regex matches the first header (again) and something resembling 6 to 8 whitespace-separated words of 1-6 alphabetic characters. There are 20-70 characters of wiggle-room between the content-type field and this group of words to allow for the rest of the first header after the content-type field. Again it might be necessary to adjust that, but you'll probably find that the messages aren't very creative and once it's set up it will match all of the little blighters.

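Since the two expressions in the rule above are untested, one rough way to try them before loading anything into clamd is to run equivalent patterns over a sample message in Python. The sketch below is only an approximation: the sample messages, the boundary name and the helper matches_rule() are made up for illustration, and Python's re module is not the engine clamd uses, so a match here does not guarantee a match there. In particular it assumes that "." may cross line breaks (re.DOTALL), which should be checked against clamd's behaviour.

```python
import re

# Python translations of the two regex strings from the Yara rule.
# Assumption: re.DOTALL lets "." cross line breaks; the engine clamd uses
# for Yara regexes may treat "." differently, so this is a rough check only.
FLAGS = re.IGNORECASE | re.DOTALL
BODY_1 = re.compile(
    r"content-type.{10,500}content-type.{10,100}application/pdf", FLAGS
)
BODY_2 = re.compile(
    r"content-type: text/plain.{20,70}(([a-z]{1,6})\s){6,8}", FLAGS
)


def matches_rule(raw_message: str) -> bool:
    """Mimic the rule's 'all of them' condition: both patterns must match."""
    return all(p.search(raw_message) is not None for p in (BODY_1, BODY_2))


# Hypothetical message shaped like the spam described in the thread:
# a short text part of random words followed by a PDF attachment.
SPAM_LIKE = (
    "From: sender@example.com\n"
    "Subject: invoice\n"
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/mixed; boundary="b1"\n'
    "\n"
    "--b1\n"
    'Content-Type: text/plain; charset="us-ascii"\n'
    "\n"
    "moka bu fyno da zosi ku xiqy zy\n"
    "--b1\n"
    'Content-Type: application/pdf; name="doc.pdf"\n'
    "Content-Transfer-Encoding: base64\n"
    "\n"
    "JVBERi0xLjQK\n"
    "--b1--\n"
)

# Hypothetical ordinary message with no PDF part, which should not match.
PLAIN_TEXT = (
    "From: someone@example.com\n"
    "Subject: hello\n"
    "MIME-Version: 1.0\n"
    'Content-Type: text/plain; charset="us-ascii"\n'
    "\n"
    "Just checking in.\n"
)

if __name__ == "__main__":
    print("spam-like sample:", matches_rule(SPAM_LIKE))
    print("plain sample:", matches_rule(PLAIN_TEXT))
```

With these two samples the first should report a match and the second should not. Note that `.{10,100}` needs at least ten characters between the second content-type and application/pdf, so in this sample the first pattern only matches by starting at the top-level multipart header and using the text part's header as the second content-type. Adding a Content-Transfer-Encoding line to the text part is enough to push that gap past 100 characters and stop the match, which shows how much the ranges depend on the real samples.
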
You could do much the same sort of thing with ClamAV signatures, but for this kind of thing Yara rules are a lot more readable and much easier to tweak when you're experimenting. The one drawback at the moment is that it's fairly easy to crash clamd with bad Yara rules. On the bright side it seems OK with complex regexes, and it's unlikely that a crash would be exploitable, as it seems to crash as soon as it tries to parse the bad rules rather than waiting until it comes across a malicious bit of data.

It's important to avoid the efficiency issues you get when the regexes attempt (and eventually fail) to match large chunks of what is potentially a very large document many times over. I don't know how well the untested attempts above will achieve that.

HTH

--

73,
Ged.

_______________________________________________

clamav-users mailing list
clamav-users@lists.clamav.net
https://lists.clamav.net/mailman/listinfo/clamav-users

Help us build a comprehensive ClamAV guide:
https://github.com/vrtadmin/clamav-faq

http://www.clamav.net/contact.html#ml