https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7995

--- Comment #15 from Riccardo Alfieri <riccardo.alfi...@spamteq.com> ---
(In reply to Henrik Krohns from comment #14)
> (In reply to Riccardo Alfieri from comment #13)
> > That is exactly what we are doing by normalizing addresses and shipping a
> > plugin that normalize the query on the lookup side.
> 
> That's assuming SpamAssassin is used. Does something like rspamd support the
> same things? Maybe someone does custom queries with mimedefang or something.
> 

Yes, Rspamd does exactly those normalizations; see the functions
"check_gmail_user" and "check_googlemail" in
https://github.com/rspamd/rspamd/blob/master/lualib/lua_util.lua

The normalization is hardcoded, maybe because Gmail is the only provider (that
I know of) that does such weird things with its mailboxes.
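
As a rough illustration of what this kind of lookup-side normalization amounts
to, here is a minimal Python sketch. It is not the Rspamd Lua code and not the
plugin discussed in this thread; the function names, the choice of SHA-1 and
the sample addresses are all assumptions made for the example. It only covers
the two things mentioned here: dots in the local part and the googlemail.com
alias.

```python
import hashlib

# Domains treated as the same mailbox provider (taken from the discussion above).
GMAIL_DOMAINS = {"gmail.com", "googlemail.com"}


def normalize_address(address: str) -> str:
    """Return a canonical form of an address (hypothetical helper)."""
    address = address.strip().lower()
    local, sep, domain = address.rpartition("@")
    if not sep:
        # Not an email address; return it unchanged apart from case/whitespace.
        return address
    if domain in GMAIL_DOMAINS:
        # Dots in the local part are not significant for these domains.
        local = local.replace(".", "")
        domain = "gmail.com"
    return f"{local}@{domain}"


def address_hash(address: str) -> str:
    """Hash the normalized address. SHA-1 is an assumption, not a documented choice."""
    return hashlib.sha1(normalize_address(address).encode("utf-8")).hexdigest()
```

With this, a query for "J.Doe@googlemail.com" and one for "jdoe@gmail.com"
produce the same hash, so only one entry per mailbox has to be listed.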

> Even your plugin can get out of date, how do you make sure that people
> update it? Same problem with SA, some people still use 5 year old versions.
> 

That is why we ensured that our plugin works with SA 3.4.1+, because many
distributions still package that version. We encourage our customers to keep an
eye out for updates and to try the latest version before reporting a bug.

> Why wouldn't I expect? Having a million or two million hashes should make no
> meaningful difference to rbldnsd resource usage, or whatever you use to
> serve the lists.

Well, two things. First, we operate at scale, and distributing lists to
hundreds of mirrors in real time is a bandwidth- and time-consuming process. If
we can shave off a few seconds here and there by keeping the lists as optimized
as possible, that is what we are going to do. Second, let's say we observe a
dropbox like "u...@gmail.com"; we would need to store hashes for:

u.s....@gmail.com
us....@gmail.com
u.s...@gmail.com
us...@gmail.com
u.s...@gmail.com
us...@gmail.com
u....@gmail.com
u...@gmail.com

and then do the same for @googlemail.com.

As you can see, this approach doesn't scale: every additional character in the
local part doubles the number of dotted variants.
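
To make the growth concrete, the following Python sketch enumerates every
dotted spelling of a local part. The function name and the sample local part
"abcd" are hypothetical; the only assumption about the provider is the one
made in this thread, namely that a dot may or may not appear between any two
adjacent characters.

```python
from itertools import product


def dot_variants(local: str) -> list[str]:
    """List every spelling of `local` with optional dots between characters."""
    if not local:
        return []
    gaps = len(local) - 1  # positions where a dot may be inserted
    variants = []
    for choice in product(("", "."), repeat=gaps):
        pieces = [local[0]]
        for sep, ch in zip(choice, local[1:]):
            pieces.append(sep + ch)
        variants.append("".join(pieces))
    return variants


# A four-character local part has 3 gaps, hence 2**3 = 8 spellings,
# the same count as the list above; doubling for the second domain gives 16.
variants = dot_variants("abcd")
total_hashes = len(variants) * 2
```

A local part of n characters therefore needs 2**(n-1) hashes per domain when
every variant is stored, against a single hash when the address is normalized
before the lookup.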

> 
> Not taking it as criticism (not that it would make any difference to me),
> and I'm trying to give nothing but constructive suggestions back, as we all
> should. :-)

Of course! I just wanted to say that explicitly, to avoid giving the impression
of wanting to "bully" you into writing a function only for our convenience.

I hope we'll reach a consensus on this :)

