A preview of the coming version 2.8.2 of amavisd-new is available at:

  http://www.ijs.si/software/amavisd/amavisd-new-2.8.2-rc1.tar.bz2
  http://www.ijs.si/software/amavisd/amavisd-new-2.8.2-rc1.tar.xz

Release notes are at:

  http://www.ijs.si/software/amavisd/release-notes.txt


amavisd-new-2.8.2-RC1 release notes

Contents:
  COMPATIBILITY
  BUG FIXES
  NEW FEATURES
  OTHER
  WHY REDIS?


COMPATIBILITY

There are no incompatible changes since the previous release.

Version 2.8.2 drops the dependency on the Perl module Redis, and makes
the dependency on modules Convert::TNEF and Convert::UUlib truly optional.


BUG FIXES

- if SQL logging was disabled, the pen pals feature was non-functional
  even when a Redis storage backend was available and collecting data;
  now pen pals is fully functional with a Redis database backend and
  no SQL;

- provide our own Redis client code, avoiding Redis CPAN module bugs,
  its slowness and its lack of support for IPv6. The noteworthy Redis
  CPAN module bug is #38 (failing to re-select a non-zero-index database
  after an automatic re-connect to a server). See:
    https://github.com/melo/perl-redis/issues/38
    https://github.com/melo/perl-redis/issues/28

- fixed a regexp used in parsing a wildcarded signing domain in a DKIM key
  declaration and a wildcarded sender pattern in signing options
  (an exotic feature, rarely used, for compatibility with dkim_milter);

- drop the hard-coded dependency on modules Convert::TNEF and
  Convert::UUlib. Convert::TNEF was made optional in amavisd-new-2.8.0,
  but the program still failed if the module could not be loaded at
  startup. Both of these modules are now loaded at run time when first
  used, subject to the @decoders setting. The use of module Convert::UUlib
  (the do_ascii entry) is disabled in the default setting of @decoders,
  and the module Convert::TNEF (the do_tnef entry) is not used if an
  external TNEF decoder (the do_tnef_ext entry) is available, or if it is
  disabled in the @decoders list.


NEW FEATURES

- IP address reputation

  When a Redis storage backend is enabled, besides the existing pen pals
  functionality, it now also offers information updating and retrieval
  on IP address reputation.

  This function is enabled by default when @storage_redis_dsn is nonempty,
  but can be disabled by setting $enable_ip_repu to false (to 0 or undef),
  per policy bank if necessary.

  For each mail message a list of public IP addresses is collected from
  the 'Received' trace header fields in its mail header section. A redis
  server maintains a database of each IP address encountered. For each
  IP address an entry carries the following counters: a number of spam
  messages having this IP address in a trace header, a number of ham
  messages, a number of banned or infected messages, and a total number
  of messages. A timestamp of the last encounter is also kept (currently
  only used for logging purposes). Each entry is subject to automatic
  expiry, so that infrequently encountered IP addresses are eventually
  purged from the database.

  When a new mail message is being processed, a lookup on all its public
  IP addresses from a trace is done. For each IP address found in the
  database a spam score is computed based on the ratio of ham versus all
  messages, and on the total number of messages. The largest spam score
  of all encountered IP addresses is then contributed as a spam score of
  the message.

  The formula for computing the spam score of each IP address is currently
  hard-coded. It is non-linear and takes into account the total number of
  encounters, diluted by the ratio of ham messages versus all messages
  seen with this IP address. The computed score cannot be negative, i.e.
  the IP reputation can only contribute to spamminess of a message and
  cannot serve as a 'whitelisting' negative score.

  A time-to-live of each IP entry is assigned dynamically: frequently
  encountered IP addresses are given longer expiration times (days),
  infrequent IP addresses are short-lived and eventually expire,
  typically in a few hours.

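  The enable/disable switch described above could be set in amavisd.conf
  roughly as follows. This is only a sketch and is not taken from the
  release notes: the policy bank name MYNETS is an illustrative choice,
  and the key spelling enable_ip_repu assumes the usual policy bank
  convention of using a variable name without its sigil. Check both
  against your installed version.

```perl
# Option A: turn IP reputation off globally (it is on by default
# whenever @storage_redis_dsn is nonempty).
# $enable_ip_repu = 0;

# Option B (assumed syntax): keep it on globally, but switch it off
# for mail processed under one policy bank.
$enable_ip_repu = 1;
$policy_bank{'MYNETS'} = {   # note: replaces any earlier MYNETS definition
  enable_ip_repu => 0,       # assumed key name: the variable without its '$'
};
```
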
  It is possible to exclude certain IP addresses or networks from
  contributing spam score by listing them in an @ip_repu_ignore_networks
  list, e.g.:

    @ip_repu_ignore_networks = qw( 192.0.2.44 192.0.2.45
                                   198.51.100.0/24 2001:db8::1:25 );

  This does not preclude a redis lookup or the updating of counts for
  IP addresses matching the list; it just clears the resulting score to
  zero. The mechanism is appropriate for excluding a site's own mailers
  (MSA and MX), or local (e.g. departmental) mailers, which may on
  occasion emit a spammy message, but should never receive a score
  penalty. There is no need to include private IP address networks in the
  list, as these are already exempt from the IP reputation database.

  An associated list of lookup tables @ip_repu_ignore_maps (whose only
  default entry is the \@ip_repu_ignore_networks) offers more flexibility
  if needed, and is a member of policy banks.

  Like other self-learning mechanisms (e.g. SpamAssassin's auto-learn,
  and AWL), the quality of the result depends on the quality of the other
  spam-gauging rules: the better spam/ham classification works
  (SpamAssassin), the more useful IP reputation becomes.

  For the purpose of IP reputation's spam and ham counts, a mail is
  considered spam if it is flagged with a contents category CC_SPAM or
  CC_SPAMMY (i.e. at tag2_level or above), and is considered ham when its
  final score is below 2.0. Intermediate scores are considered
  unclassified.

  A nice feature of the mechanism is that it reacts fairly quickly to
  a new surge of unwanted messages from some IP address, either foreign
  or local.

  For insight into the IP address reputation behaviour, search the log
  for ' redis: IP '. At log level 2 only spammy hits are logged, at log
  level 3 the clean hits are shown too. The log entry shows spam, ham,
  banned+infected and unclassified counts for an IP address, a percentage
  of unwanted (spam+banned+infected) messages out of the total count, and
  the associated score.

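  To watch the IP reputation log entries described above, the log level
  can be raised in amavisd.conf. A minimal sketch, assuming the standard
  $log_level setting is the one that governs these entries, as the text
  suggests:

```perl
# 2: log only spammy IP reputation hits
# 3: also log clean hits
$log_level = 3;
```

  The entries can then be found by searching the log for ' redis: IP ';
  where the log ends up depends on your syslog setup.
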
  A redis server needs to be started on a loopback interface. Apart from
  possibly changing its 'bind' setting in redis.conf, no other
  configuration changes are necessary, and a database need not be
  initialized. Here is an example configuration in amavisd.conf:

    @storage_redis_dsn = ( { server => '127.0.0.1:6379', db_id => 1 }, );
    # list your MX and MSA mailer IP addresses or networks here:
    @ip_repu_ignore_networks = qw( 192.0.2.44 2001:db8::/64 );

  A redis server needs to support Lua scripting, which is available
  since version 2.6. Support for IPv6 is available since version 2.8.0.


OTHER

- dropped the dependency on the CPAN module Redis in favour of our own
  client-side implementation of the redis protocol (Amavis::TinyRedis).
  It is faster and smaller, and supports opening sessions with a redis
  server over IPv6 (or over IPv4 or over a Unix socket). The redis server
  supports IPv6 starting with version 2.8.0.

  Currently supported options in @storage_redis_dsn are: server, db_id,
  password, and ttl.

  The 'server' option specifies an INET or INET6 socket (a host IP address
  or name and a port number) or an absolute path to a Unix socket.
  An IPv6 address must be enclosed in square brackets. The default value
  is '127.0.0.1:6379'. Match this with your redis configuration.

  Option 'db_id' specifies a redis database index (given to a "SELECT"
  redis command). Its value is a (small) integer, defaulting to 0.
  This allows independent databases to co-exist on the same redis
  server, e.g. an amavis database and a SpamAssassin Bayes database.

  The 'ttl' option can override the global setting $storage_redis_ttl on
  a per-server basis. Its value is an integer, representing the number of
  seconds for the expiration time of pen pals records. It defaults to
  $storage_redis_ttl, which in turn defaults to 16 days (in seconds).
  This setting does not affect IP reputation records, whose expiration
  time is computed dynamically.

  Example:

    $storage_redis_ttl = 22*24*3600;  # 22 days for pen pals records
    @storage_redis_dsn = (  # alternative servers, use the first which works
      { server => '[::1]:6379', db_id => 1 },
      { server => '127.0.0.1:6379', db_id => 1, password => 'abc...' },
      { server => '/tmp/redis.sock', db_id => 1, ttl => 8*24*3600 },
    );

  Btw, make sure to keep the setting $database_sessions_persistent
  at its default value (1, i.e. enabled), otherwise Redis performance
  will suffer somewhat.

- store only the information essential for pen pals operation in a Redis
  storage backend, to save memory on a database server; information on
  inbound messages is no longer stored there, i.e. only information on
  originating messages is kept;

- more informative logging of pen pals query results when using a Redis
  storage backend. The redis support code (Lua and protocol handling)
  was largely rewritten for efficiency since amavisd-new 2.8.1;

- added LDAP attribute amavisDisclaimerOptions 1.3.6.1.4.1.15312.2.2.1.47
  to LDAP.schema; contributed by Quanah Gibson-Mount;

- filter for public IP addresses from a Received trace only once;

- add one digit of precision in the TIMING log report to reported
  small elapsed times (below 5 ms);

- documentation README.sql-mysql: added "CREATE INDEX msgs_idx_mail_id..."
  with a note on an InnoDB requirement for a foreign key; by Jernej Porenta;


WHY REDIS?

A redis database was chosen initially because SpamAssassin 3.4.0 supports
keeping its Bayes database in a redis server, which makes it very fast,
so this makes a redis database readily available to amavisd too.

Redis has some features that make it suitable for use as a pen pals
database, for Bayes storage, and now for IP reputation:

- automatic expiration of entries, based on each key's individual
  time-to-live setting, makes explicit database maintenance unnecessary;

- being accessible over inet (or Unix sockets), it allows several amavisd
  hosts to use a common redis server, possibly running on a dedicated host;

- it supports Lua scripting, which makes it possible to perform multiple
  basic operations in one go as a single functional operation of the
  application. This reduces multiple network round trips to a single
  network transaction, reducing network packet rate and latency;

- compared to SQL storage for pen pals (and for a Bayes database),
  the redis read speed is faster, but the write speed is MUCH faster;

- as an in-memory database with optional periodic disk persistence, it is
  well suited for pen pals, IP reputation and Bayes storage: it is fast,
  and a redis server restart reloads data from the last snapshot, thus
  only losing the last minute or two of updates when trouble strikes,
  which is acceptable for these three databases;

- it makes it possible to eliminate SQL r/w storage if its only purpose
  was to provide pen pals functionality (and SpamAssassin's Bayes).

Mark
