Re: [mailop] Junk filtering as a tool for unfair competition

Brandon Long via mailop Wed, 23 Oct 2019 14:50:36 -0700

On Wed, Oct 23, 2019 at 7:00 AM Doug Royer via mailop <mailop@mailop.org>
wrote:

> On 10/22/19 3:36 PM, Daniele via mailop wrote:
> > It looks like Microsoft, with its long history of questionable
> > practices, has recently developed a new strategy for tearing down its
> > weaker competitors.
>
> Not directly related, but gmail has been putting MANY more false positives
> in the spam folder. I used to get 1-3 per week. Now false positives are
> closer to 60 a day.
>
> And MOST (about 80%) of them have the X-Microsoft headers. On the first
> day 99% had the text portion ONLY base64 encoded, not text/plain alternate.
> Only 1 was DANE related.
>
> It may be that people are tweaking headers. And I think many are tweaking
> their filtering rules to adjust to the changing spam. It used to be that
> 100% of the email I got with ONLY base64 encoded text was spam, as it
> attempted to bypass filters. I am guessing that gmail had noticed a similar
> trend and may be filtering those as spam.
>
> And why does Microsoft need about 60 X-Microsoft headers per email? Maybe
> it is time for the IETF to deprecate X- headers.
>

Just a guess... At Gmail, once your email is inside our system, we wrap it
in a protocol buffer <https://developers.google.com/protocol-buffers> and
stick information about the message that we learn at various points into
that.  There are hundreds of entries in the proto for messages, and that
doesn't even count the "spam features", which number somewhere over
5000 (not all of which are still in use; that's just the number of enum
values at this point).

The messages transit different servers in our system using our RPC system
(the precursor to gRPC <https://grpc.io>), so all of this data can be
shared out of band from the actual message contents.  Many sub-systems
don't even get the full message contents, only the small parts they need.
We only fall back to SMTP when relaying between systems or virtual
ADMDs... and even then, we're trying to do more to keep things internal so
we can keep the accumulated data.

The headers we add that are a black box externally are for our own
consumption when mail transits via SMTP, and any that look like base64
data are base64 data (usually a serialized protobuf that's encrypted and
then base64 encoded).  At first we did that just so we could somewhat
trust the data that came back, but now it's done by default to avoid any
possible privacy issues.
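
For readers who want a concrete picture of "opaque metadata in a header",
here is a minimal sketch in Python.  It is not Gmail's code or format: it
uses JSON as a stand-in for a serialized protobuf, and it swaps encryption
for an HMAC tag, so it only covers the "can we trust the data that came
back" half and does nothing for privacy.  The key, the header name, and
both function names are hypothetical.

```python
import base64
import hashlib
import hmac
import json

# Hypothetical values for illustration only.
SECRET_KEY = b"example-key-not-for-production"
HEADER_NAME = "X-Example-Metadata"
TAG_LEN = hashlib.sha256().digest_size  # 32 bytes


def pack_metadata(metadata: dict) -> str:
    """Serialize metadata, prepend an HMAC tag, and base64-encode the result."""
    payload = json.dumps(metadata, sort_keys=True).encode("utf-8")
    tag = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    return base64.b64encode(tag + payload).decode("ascii")


def unpack_metadata(header_value: str):
    """Return the metadata dict, or None if the value is malformed or altered."""
    try:
        raw = base64.b64decode(header_value, validate=True)
    except ValueError:  # binascii.Error is a subclass of ValueError
        return None
    if len(raw) < TAG_LEN:
        return None
    tag, payload = raw[:TAG_LEN], raw[TAG_LEN:]
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None  # header was forged or modified in transit
    return json.loads(payload)


if __name__ == "__main__":
    value = pack_metadata({"verdict": "ham", "features": [12, 407]})
    print(f"{HEADER_NAME}: {value}")
    print(unpack_metadata(value))
```

A system that also needs the privacy property would encrypt the payload
with an authenticated cipher instead of only tagging it; that is left out
here because it needs a third-party library.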

Most smaller systems just move messages around using SMTP or LMTP or
whatever (POP/IMAP to the client), and there's less room for out-of-band
information there, so you get headers like Authentication-Results or
various system-specific X- headers.

If I had to guess, MS uses a system much more like that, or as they've also
pointed out, they have a bunch of different systems acting as one, so they
resort to stuffing the info they need into headers so that the Hotmail
system can share with the Exchange systems and with the FrontBridge systems.

Which is fine.  Who the heck cares how many or what type of X- headers they
add?  They aren't for you.  Are they causing your system issues?  I know we
put in a maximum size of headers at one point to prevent some poor edge
cases in our system, but if the size of standard headers reached that
point... we'd just make the cut-off bigger.
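
If you want to check whether header volume could matter on your own
system, a rough sketch like the following measures the header block and
counts X- headers in a raw message.  The limit value and the function
names are hypothetical; nothing here reflects the actual cut-off any
provider uses.

```python
from email.parser import BytesParser

# Hypothetical limit for illustration; pick whatever your system tolerates.
MAX_HEADER_BYTES = 256 * 1024


def header_block_size(raw_message: bytes) -> int:
    """Bytes before the first blank line (the header/body separator)."""
    positions = [raw_message.find(sep) for sep in (b"\r\n\r\n", b"\n\n")]
    found = [p for p in positions if p != -1]
    return min(found) if found else len(raw_message)


def count_x_headers(raw_message: bytes) -> int:
    """Number of header fields whose name starts with 'X-'."""
    msg = BytesParser().parsebytes(raw_message, headersonly=True)
    return sum(1 for name in msg.keys() if name.lower().startswith("x-"))


def headers_too_large(raw_message: bytes, limit: int = MAX_HEADER_BYTES) -> bool:
    """True if the header block exceeds the configured limit."""
    return header_block_size(raw_message) > limit
```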

Brandon

_______________________________________________
mailop mailing list
mailop@mailop.org
https://chilli.nosignal.org/cgi-bin/mailman/listinfo/mailop
