Re: price less
There are similar mailings for other kinds of "customers". MongoDB customers, MariaDB Users, SugarCRM users, Unix Software users. I have a bunch of rules against them. If I send samples they won't make it through our filters.

On 4/03/2023 19:05, Benny Pedersen wrote:

Hello,

I would like to know if you are interested in acquiring Colocation Customer List.

Information fields: Names, Title, Email, Phone, Company Name, Company URL, Company physical address, SIC Code, Industry, and Specialty (Revenue and Employee).

Please let me know your target geography so that I will get back to you with the counts, Pricing, and more information.

Regards,
Sarah
Marketing Executive

spamassassin tag

X-Spam-ASN: AS15169 GOOGLE
Return-Path:
X-Spam-Status: No, score=2.6 required=5.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,DKIM_VALID_EF,DMARC_PASS,FREEMAIL_FROM,HTML_MESSAGE, ITA_GMAIL_UNDISCLOSED,KAM_LIST3_1,RCVD_IN_DNSWL_NONE,RELAYCOUNTRY_GREY, SPF_HELO_NONE,SPF_PASS autolearn=no autolearn_force=no version=4.0.0

maybe time for a new job without any urls ?
Re: How to incorporate network blocks
Actually, ipset supports the fromip-toip range syntax:

CREATE-OPTIONS := range fromip-toip|ip/cidr [ netmask cidr ] [ timeout value ] [ counters ] [ comment ] [ skbinfo ]

On 11/11/2022 18:10, Bill Cole wrote:

On 2022-11-11 at 11:26:13 UTC-0500 (Fri, 11 Nov 2022 09:26:13 -0700) Grant Taylor via users is rumored to have said:

On 11/11/22 9:09 AM, Bert Van de Poel wrote:

- IP/CIDR lists like the one you mention, but also lists like Stop Forum Spam (https://www.stopforumspam.com/) I cron fetch then add to an ipset with a DROP (which is quite similar to what others are suggesting).

Stop Forum Spam seems interesting. I'd be curious to see how you're converting SFS list(s) to ipset entries. Mostly I've not yet had enough coffee to convert from a range of IPs; -, to CIDR; /.

From my bashrc...

# type cidrcon
cidrcon is a function
cidrcon ()
{
    for a in $*; do echo $a; done | perl -e "use Net::CIDR::Lite; \$cidr = Net::CIDR::Lite->new(<>) ; \$_ = join (\"\n\",\$cidr->list) ; print \"\$_\n\";"
}

Obviously requires Perl and the Net::CIDR::Lite module. I do not recall why the implementation is so weird, but I've been using it for decades(!?)

I didn't pay close attention to the list, but I did see that it was range based and would need some conversion. -- I have added it to my pile of things to look at more closely later.

--
Grant. . . .
unix || die
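For anyone without Perl and Net::CIDR::Lite at hand, the range-to-CIDR conversion discussed above can also be sketched with Python's standard ipaddress module. This is only a minimal sketch and not what anyone in this thread runs: the function name ranges_to_cidrs is made up for illustration, the sample addresses come from the 192.0.2.0/24 documentation block, and it assumes every input is a single "first-last" range.

```python
import ipaddress
from typing import Iterable, List


def ranges_to_cidrs(ranges: Iterable[str]) -> List[str]:
    """Convert "first-last" IP ranges into a minimal list of CIDR blocks."""
    networks = []
    for item in ranges:
        first_text, last_text = item.split("-", 1)
        first = ipaddress.ip_address(first_text.strip())
        last = ipaddress.ip_address(last_text.strip())
        # summarize_address_range yields the CIDR blocks that exactly cover first..last.
        networks.extend(ipaddress.summarize_address_range(first, last))
    # collapse_addresses merges adjacent or overlapping blocks across all ranges.
    return [str(net) for net in ipaddress.collapse_addresses(networks)]


if __name__ == "__main__":
    # Hypothetical sample range, not taken from any real blocklist.
    for cidr in ranges_to_cidrs(["192.0.2.0-192.0.2.130"]):
        print(cidr)
```

Since ipset accepts fromip-toip ranges directly, as noted at the top of this message, a conversion like this is probably only needed for tools that insist on CIDR notation.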
Re: How to incorporate network blocks
I've been dealing with IP blocklists using two other methods before email even reaches SA:

- In postfix my smtpd_recipient_restrictions includes "reject_rbl_client zen.spamhaus.org, reject_rhsbl_reverse_client dbl.spamhaus.org, reject_rhsbl_helo dbl.spamhaus.org, reject_rhsbl_sender dbl.spamhaus.org" and I'm guessing potentially others could be added.
- IP/CIDR lists like the one you mention, but also lists like Stop Forum Spam (https://www.stopforumspam.com/) I cron fetch then add to an ipset with a DROP (which is quite similar to what others are suggesting).

I find that those are quite suitable.

Bert

On 10/11/2022 18:05, Grant Taylor via users wrote:

On 11/10/22 9:54 AM, Joey J wrote:

Hello All,

Hi,

I'm trying to see if there is a way to incorporate network ranges into SA to essentially flag messages.

N.B. at least one of the lists below is individual IPs and not networks / ranges of IPs. -- I'm not sure how to square that peg with your wants / needs.

I know I can use iptables and reject it before getting to SA, but in some cases we would have legit email get flagged within these bigger blocks.

I would suggest investigating the other offerings from each vendor. I suspect there is a good chance that many, if not all, of them offer a DNS based query method. See Riccardo's comment about Spamhaus / Spamteq.

I'm trying to incorporate:

feeds.dshield.org/block.txt
spamhaus.org/drop/drop.lasso
ciarmy.com/list/ci-badguys.txt
openbl.org/lists/base.txt

Short of that, it wouldn't be hard to turn them into a locally hosted BL and then configure SpamAssassin to query it.
Re: subscribe to blacklist for domains
I think what Noel is referring to is Postfix configuration like this for example:

smtpd_recipient_restrictions = permit_mynetworks,
    permit_sasl_authenticated,
    reject_unauth_destination,
    reject_rbl_client zen.spamhaus.org,
    reject_rhsbl_reverse_client dbl.spamhaus.org,
    reject_rhsbl_helo dbl.spamhaus.org,
    reject_rhsbl_sender dbl.spamhaus.org,
    reject_non_fqdn_recipient,
    reject_unknown_recipient_domain

Notice the spamhaus entries for the different blocklist settings.

On 13/08/2022 15:38, joe a wrote:

On 8/12/2022 11:43 PM, Noel Butler wrote:

Why are you not blocking with blacklists at the border, ie: MTA.

I'm not familiar with how to do that or if it can be done. Since SA offers this functionality, I did not even consider that. I'll look into it.

Given it's 0 resources for your MTA, with anti spam checking on SA often using significant resources (depending on traffic/number of tests/rules etc), it's best to stop it getting to SA in the first place.

SA also has this by-default list of domains that it never checks. For a long time I have disagreed with this: we are the ones to decide who gets whitelisted, not SA, not some paid third party. The option clear_uridnsbl_skip_domain however prevents this, but then you have to locate and 0 all the general rulesets scores that are whitelists as well.

The configuration/usage of those lists causes me great frustration. Semi retirement and infrequent "tech stuff" may be partly to blame.
Re: Spam with Pyzor and DCC scores
On 11/07/2022 15:44, Matus UHLAR - fantomas wrote:

On 11.07.22 12:57, Bert Van de Poel wrote:

A few times a month we have spam messages getting through, often in German, that have some spam score but not enough to be marked/discarded. Always these messages are marked by DCC, since they're of course bulk spam, but it's also not uncommon to see Pyzor as well. I've been wondering if there are realistic cases where both DCC and Pyzor would mark as spam while the message was ham.

this is likely to happen if the message is empty or nearly empty. some people are stupid, send one-two words or a short link in message without Subject: ...

Oh yeah, that's a case I hadn't thought of, good point!

I feel like when both co-occur it's a pretty solid sign it's spam. Therefore, I'm wondering if an upstream amplification (or a local one) would make sense. Some examples (I can also supply full emails, but fear this might prevent my message from arriving):

X-Spam-Status: No, score=4.082 tagged_above=- required=5 tests=[DCC_CHECK=1.1, DIGEST_MULTIPLE=0.001, FSL_BULK_SIG=0.001, HEADER_FROM_DIFFERENT_DOMAINS=0.25, HTML_IMAGE_RATIO_08=0.001, HTML_MESSAGE=0.001, MIME_HTML_ONLY=0.1, PYZOR_CHECK=1.985, SPF_HELO_NONE=0.001, SPF_NEUTRAL=0.652, T_SCC_BODY_TEXT_LINE=-0.01]

X-Spam-Status: No, score=4.816 tagged_above=- required=5 tests=[DCC_CHECK=1.1, DIGEST_MULTIPLE=0.001, FSL_BULK_SIG=0.001, HEADER_FROM_DIFFERENT_DOMAINS=0.248, HTML_IMAGE_ONLY_28=0.726, HTML_IMAGE_RATIO_02=0.001, HTML_MESSAGE=0.001, MIME_HTML_ONLY=0.1, PYZOR_CHECK=1.985, SPF_HELO_NONE=0.001, SPF_NEUTRAL=0.652, T_REMOTE_IMAGE=0.01, T_SCC_BODY_TEXT_LINE=-0.01]

X-Spam-Status: No, score=4.109 tagged_above=- required=5 tests=[DCC_CHECK=1.1, DIGEST_MULTIPLE=0.001, FSL_BULK_SIG=0.029, HEADER_FROM_DIFFERENT_DOMAINS=0.249, HTML_IMAGE_RATIO_04=0.001, HTML_MESSAGE=0.001, MIME_HTML_ONLY=0.1, PYZOR_CHECK=1.985, SPF_HELO_NONE=0.001, SPF_NEUTRAL=0.652, T_SCC_BODY_TEXT_LINE=-0.01]

looks like you should implement bayes. since these are generated by amavis, you could train amavis database.

We have Bayes running on the main server, but my own local server doesn't have it, which is why it's missing. I did however take all spam I received myself in 2022 that wasn't caught and fed it to sa-learn (for the amavis user), thx for that suggestion. Let's hope it works to remove this minor inconvenience :)
Spam with Pyzor and DCC scores
Hi everyone,

A few times a month we have spam messages getting through, often in German, that have some spam score but not enough to be marked/discarded. Always these messages are marked by DCC, since they're of course bulk spam, but it's also not uncommon to see Pyzor as well. I've been wondering if there are realistic cases where both DCC and Pyzor would mark as spam while the message was ham. I feel like when both co-occur it's a pretty solid sign it's spam. Therefore, I'm wondering if an upstream amplification (or a local one) would make sense. Some examples (I can also supply full emails, but fear this might prevent my message from arriving):

X-Spam-Status: No, score=4.082 tagged_above=- required=5 tests=[DCC_CHECK=1.1, DIGEST_MULTIPLE=0.001, FSL_BULK_SIG=0.001, HEADER_FROM_DIFFERENT_DOMAINS=0.25, HTML_IMAGE_RATIO_08=0.001, HTML_MESSAGE=0.001, MIME_HTML_ONLY=0.1, PYZOR_CHECK=1.985, SPF_HELO_NONE=0.001, SPF_NEUTRAL=0.652, T_SCC_BODY_TEXT_LINE=-0.01]

X-Spam-Status: No, score=4.816 tagged_above=- required=5 tests=[DCC_CHECK=1.1, DIGEST_MULTIPLE=0.001, FSL_BULK_SIG=0.001, HEADER_FROM_DIFFERENT_DOMAINS=0.248, HTML_IMAGE_ONLY_28=0.726, HTML_IMAGE_RATIO_02=0.001, HTML_MESSAGE=0.001, MIME_HTML_ONLY=0.1, PYZOR_CHECK=1.985, SPF_HELO_NONE=0.001, SPF_NEUTRAL=0.652, T_REMOTE_IMAGE=0.01, T_SCC_BODY_TEXT_LINE=-0.01]

X-Spam-Status: No, score=4.109 tagged_above=- required=5 tests=[DCC_CHECK=1.1, DIGEST_MULTIPLE=0.001, FSL_BULK_SIG=0.029, HEADER_FROM_DIFFERENT_DOMAINS=0.249, HTML_IMAGE_RATIO_04=0.001, HTML_MESSAGE=0.001, MIME_HTML_ONLY=0.1, PYZOR_CHECK=1.985, SPF_HELO_NONE=0.001, SPF_NEUTRAL=0.652, T_SCC_BODY_TEXT_LINE=-0.01]

What's people's opinion here?

Kind regards,
Bert Van de Poel
ULYSSIS
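To get a feel for what a local amplification would do to messages like the ones quoted above, the amavis-style header can be parsed and re-scored offline. The following is a rough sketch only: parse_tests, both_digests_hit and HYPOTHETICAL_BONUS are names invented here, and the bonus of 1.0 is an arbitrary assumption for illustration, not a recommended score.

```python
import re
from typing import Dict


def parse_tests(header: str) -> Dict[str, float]:
    """Extract {rule_name: score} from an amavis-style X-Spam-Status header."""
    match = re.search(r"tests=\[(.*?)\]", header, re.DOTALL)
    if match is None:
        return {}
    tests = {}
    for entry in match.group(1).split(","):
        name, _, score = entry.strip().partition("=")
        if name and score:
            tests[name] = float(score)
    return tests


def both_digests_hit(tests: Dict[str, float]) -> bool:
    """True when both DCC_CHECK and PYZOR_CHECK are among the rules that hit."""
    return "DCC_CHECK" in tests and "PYZOR_CHECK" in tests


# Assumption for illustration only; pick a value that suits your own mail flow.
HYPOTHETICAL_BONUS = 1.0

# First example header from the message above, copied as posted.
header = (
    "X-Spam-Status: No, score=4.082 tagged_above=- required=5 "
    "tests=[DCC_CHECK=1.1, DIGEST_MULTIPLE=0.001, FSL_BULK_SIG=0.001, "
    "HEADER_FROM_DIFFERENT_DOMAINS=0.25, HTML_IMAGE_RATIO_08=0.001, "
    "HTML_MESSAGE=0.001, MIME_HTML_ONLY=0.1, PYZOR_CHECK=1.985, "
    "SPF_HELO_NONE=0.001, SPF_NEUTRAL=0.652, T_SCC_BODY_TEXT_LINE=-0.01]"
)

tests = parse_tests(header)
total = sum(tests.values())
if both_digests_hit(tests):
    print(f"score {total:.3f} would become {total + HYPOTHETICAL_BONUS:.3f}")
```

As far as I know, the DIGEST_MULTIPLE hit visible in all three headers is the stock rule for more than one digest check firing, so raising its local score may be the simplest way to try this for real; that is worth verifying against your own rule set before relying on it.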
Re: Spamassassin spamming in log
If you are using systemd, you can "systemctl disable spamd". Otherwise you can indeed use ENABLED=0. I would probably do both just in case ;)

On 2/06/2022 20:36, Timo Brandt wrote:

Maybe one of you has a hint for me how to disable the automatic startup of spamd? It's been a long time since I set up a Debian from scratch :-(

It seems that spamd doesn't need to start at system boot so I will disable it. Will this be done when I add ENABLED=0 into the file /etc/default/spamassassin ?

Thanks,
Timo

On 2022-06-02 20:27, Timo Brandt wrote:

Hi all,

indeed - sorry. I wasn't aware that I do not need to run spamd beside amavis.

Thanks for all your help.

Timo

On 2022-06-02 20:18, Matija Nalis wrote:

On Thu, Jun 02, 2022 at 02:47:28PM +0200, Bert Van de Poel wrote:

For the errors about nonexistent users you will want to have a look at /etc/default/spamassassin I'm guessing. For the info messages: this has just got to do with your logging level. You will want to decrease it in local.cf or maybe also in the default file.

Also, depending on your distro and init system, /etc/default/spamassassin might not be processed (e.g. on Debian systems, in many cases /etc/default/* entries are only read via /etc/init.d/* System-V-init scripts, and not used when using default systemd init system).

You should use "ps auxw" to determine with exactly what parameters it is being run, and then grep the system for those flags if different from ones in /etc/default/spamassassin (esp. when you change that file and restart, but changes are not applied)
Re: Spamassassin spamming in log
Did you restart the unit after changing the configuration? It does seem like debian-spamd is indeed the correct user. I'm not sure how exactly the spawning of children works within SA. Has your CPU usage decreased now?

PS: you can just reply to the list, no need to email me personally every time, that just causes me to get each message twice.

On 2/06/2022 15:17, Timo Brandt wrote:

Hi Bert,

I checked the user table:

debian-spamd:x:114:120::/var/lib/spamassassin:/usr/sbin/nologin

And also adjusted the config file:

OPTIONS="-u debian-spamd --create-prefs --max-children 5 --helper-home-dir -s /var/log/spamassassin/spamd.log

But process is already running under root:

On 2022-06-02 15:13, Bert Van de Poel wrote:

For the error: does the spamd user actually exist? That's a requirement of course.

I've always controlled SA loglevels through amavis, but from the spamd man page I would expect that it's related to -D. I'm not completely sure what the default is. http://wiki.apache.org/spamassassin/DebugChannels is listed for more information.

I expect your high CPU usage is just coming from SA trying to spawn children as a user that doesn't exist though.

On 2/06/2022 14:57, Timo Brandt wrote:

Hi Bert,

many thanks for your answer. Please find the spamassassin config below. I already checked it but do not find anything to change which is stopping the flooding.

Also, spamassassin is consuming almost all of my CPU power. When it's running, the CPU is nearly the whole time at 99%. When I stop spamassassin, the CPU consumption is going down.

# /etc/default/spamassassin
# Duncan Findlay

# WARNING: please read README.spamd before using.
# There may be security risks.

# Prior to version 3.4.2-1, spamd could be enabled by setting
# ENABLED=1 in this file. This is no longer supported. Instead, please
# use the update-rc.d command, invoked for example as "update-rc.d spamassassin enable", to enable the spamd service.

# Options
# See man spamd for possible options. The -d option is automatically added.
ENABLED=1

# SpamAssassin uses a preforking model, so be careful! You need to
# make sure --max-children is not set to anything higher than 5,
# unless you know what you're doing.
OPTIONS="--create-prefs --max-children 5 --helper-home-dir --username spamd --helper-home-dir /home/spamd -s /var/log/spamassassin/spamd.log

# Pid file
# Where should spamd write its PID to file? If you use the -u or
# --username option above, this needs to be writable by that user.
# Otherwise, the init script will not be able to shut spamd down.
PIDFILE="/var/run/spamd.pid"

# Set nice level of spamd
#NICE="--nicelevel 15"

# Cronjob
# Set to anything but 0 to enable the cron job to automatically update
# spamassassin's rules on a nightly basis
CRON=1

On 2022-06-02 14:47, Bert Van de Poel wrote:

For the errors about nonexistent users you will want to have a look at /etc/default/spamassassin I'm guessing. For the info messages: this has just got to do with your logging level. You will want to decrease it in local.cf or maybe also in the default file.

On 2/06/2022 14:33, Timo Brandt wrote:

Hi all,

I have a running debian 11 with postfix/dovecot and Amavis with clamav / spamassassin. I saw that the spamassassin logfile is growing very fast and found the following entries occurring many times per second.

Can you maybe help me to get this fixed? I searched the internet but did not really find a solution. Do you need any config files to check?

Thanks for your help in advance,
Timo

Thu Jun 2 11:43:11 2022 [1848608] info: spamd: handled cleanup of child pid [1849690] due to SIGCHLD: exit 255
Thu Jun 2 11:43:11 2022 [1848608] info: spamd: handled cleanup of child pid [1849691] due to SIGCHLD: exit 255
Thu Jun 2 11:43:11 2022 [1848608] info: spamd: handled cleanup of child pid [1849692] due to SIGCHLD: exit 255
Thu Jun 2 11:43:11 2022 [1848608] info: spamd: handled cleanup of child pid [1849693] due to SIGCHLD: exit 255
Thu Jun 2 11:43:11 2022 [1848608] info: spamd: server successfully spawned child process, pid 1849698
Thu Jun 2 11:43:11 2022 [1848608] info: spamd: handled cleanup of child pid [1849696] due to SIGCHLD: exit 255
Thu Jun 2 11:43:11 2022 [1849698] error: spamd: cannot run as nonexistent user or root with -u option
Thu Jun 2 11:43:11 2022 [1848608] info: spamd: server successfully spawned child process, pid 1849699
Thu Jun 2 11:43:11 2022 [1848608] info: prefork: child states: SS
Thu Jun 2 11:43:11 2022 [1848608] info: spamd: server successfully spawned child process, pid 1849700
Thu Jun 2 11:43:11 2022 [
Re: Spamassassin spamming in log
For the error: does the spamd user actually exist? That's a requirement of course.

I've always controlled SA loglevels through amavis, but from the spamd man page I would expect that it's related to -D. I'm not completely sure what the default is. http://wiki.apache.org/spamassassin/DebugChannels is listed for more information.

I expect your high CPU usage is just coming from SA trying to spawn children as a user that doesn't exist though.

On 2/06/2022 14:57, Timo Brandt wrote:

Hi Bert,

many thanks for your answer. Please find the spamassassin config below. I already checked it but do not find anything to change which is stopping the flooding.

Also, spamassassin is consuming almost all of my CPU power. When it's running, the CPU is nearly the whole time at 99%. When I stop spamassassin, the CPU consumption is going down.

# /etc/default/spamassassin
# Duncan Findlay

# WARNING: please read README.spamd before using.
# There may be security risks.

# Prior to version 3.4.2-1, spamd could be enabled by setting
# ENABLED=1 in this file. This is no longer supported. Instead, please
# use the update-rc.d command, invoked for example as "update-rc.d spamassassin enable", to enable the spamd service.

# Options
# See man spamd for possible options. The -d option is automatically added.
ENABLED=1

# SpamAssassin uses a preforking model, so be careful! You need to
# make sure --max-children is not set to anything higher than 5,
# unless you know what you're doing.
OPTIONS="--create-prefs --max-children 5 --helper-home-dir --username spamd --helper-home-dir /home/spamd -s /var/log/spamassassin/spamd.log

# Pid file
# Where should spamd write its PID to file? If you use the -u or
# --username option above, this needs to be writable by that user.
# Otherwise, the init script will not be able to shut spamd down.
PIDFILE="/var/run/spamd.pid"

# Set nice level of spamd
#NICE="--nicelevel 15"

# Cronjob
# Set to anything but 0 to enable the cron job to automatically update
# spamassassin's rules on a nightly basis
CRON=1

On 2022-06-02 14:47, Bert Van de Poel wrote:

For the errors about nonexistent users you will want to have a look at /etc/default/spamassassin I'm guessing. For the info messages: this has just got to do with your logging level. You will want to decrease it in local.cf or maybe also in the default file.

On 2/06/2022 14:33, Timo Brandt wrote:

Hi all,

I have a running debian 11 with postfix/dovecot and Amavis with clamav / spamassassin. I saw that the spamassassin logfile is growing very fast and found the following entries occurring many times per second.

Can you maybe help me to get this fixed? I searched the internet but did not really find a solution. Do you need any config files to check?

Thanks for your help in advance,
Timo

Thu Jun 2 11:43:11 2022 [1848608] info: spamd: handled cleanup of child pid [1849690] due to SIGCHLD: exit 255
Thu Jun 2 11:43:11 2022 [1848608] info: spamd: handled cleanup of child pid [1849691] due to SIGCHLD: exit 255
Thu Jun 2 11:43:11 2022 [1848608] info: spamd: handled cleanup of child pid [1849692] due to SIGCHLD: exit 255
Thu Jun 2 11:43:11 2022 [1848608] info: spamd: handled cleanup of child pid [1849693] due to SIGCHLD: exit 255
Thu Jun 2 11:43:11 2022 [1848608] info: spamd: server successfully spawned child process, pid 1849698
Thu Jun 2 11:43:11 2022 [1848608] info: spamd: handled cleanup of child pid [1849696] due to SIGCHLD: exit 255
Thu Jun 2 11:43:11 2022 [1849698] error: spamd: cannot run as nonexistent user or root with -u option
Thu Jun 2 11:43:11 2022 [1848608] info: spamd: server successfully spawned child process, pid 1849699
Thu Jun 2 11:43:11 2022 [1848608] info: prefork: child states: SS
Thu Jun 2 11:43:11 2022 [1848608] info: spamd: server successfully spawned child process, pid 1849700
Thu Jun 2 11:43:11 2022 [1848608] info: prefork: adjust: 0 idle children less than 1 minimum idle children. Increasing spamd children: 1849700 started.
Thu Jun 2 11:43:11 2022 [1849699] error: spamd: cannot run as nonexistent user or root with -u option
Thu Jun 2 11:43:11 2022 [1849700] error: spamd: cannot run as nonexistent user or root with -u option
Thu Jun 2 11:43:11 2022 [1848608] info: prefork: child states: SSS
Thu Jun 2 11:43:11 2022 [1848608] info: spamd: server successfully spawned child process, pid 1849701
Thu Jun 2 11:43:11 2022 [1848608] info: prefork: adjust: 0 idle children less than 1 minimum idle children. Increasing spamd children: 1849701 started.
Thu Jun 2 11:43:11 2022 [1848608] info: prefork: child states:
Thu Jun 2 11:43:11 2022 [1848608] info: spamd: server successfully spawned child process, pid 1849702
Thu Jun 2 11:43:11 2022 [1849701] error: spamd: cannot run as nonexistent user or root with -u option
Thu Jun 2 11:43:11 2022 [1848608] info: prefork: adjust: 0 idle children less than 1 minimum idle children.
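The question at the top of this message, whether the user handed to spamd actually exists, can be checked with a few lines of Python. This is a hedged sketch, not part of SpamAssassin or the Debian packaging: username_from_options and user_exists are hypothetical helper names, and the option string is simply the one quoted in the thread.

```python
import pwd
import shlex
from typing import Optional


def username_from_options(options: str) -> Optional[str]:
    """Return the user named by -u/--username in a spamd option string, if any."""
    tokens = shlex.split(options)
    for index, token in enumerate(tokens):
        if token in ("-u", "--username") and index + 1 < len(tokens):
            return tokens[index + 1]
        if token.startswith("--username="):
            return token.split("=", 1)[1]
    return None


def user_exists(name: str) -> bool:
    """True if the local account database knows a user with this name."""
    try:
        pwd.getpwnam(name)
    except KeyError:
        return False
    return True


# Option string as quoted in the thread.
OPTIONS = (
    "--create-prefs --max-children 5 --helper-home-dir --username spamd "
    "--helper-home-dir /home/spamd -s /var/log/spamassassin/spamd.log"
)

user = username_from_options(OPTIONS)
if user is None:
    print("no -u/--username option given")
elif user_exists(user):
    print(f"user {user!r} exists")
else:
    print(f"user {user!r} does not exist")
```

On the system described here one would expect the "does not exist" branch for spamd, which would match the repeated "cannot run as nonexistent user" errors in the log.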
Re: Spamassassin spamming in log
For the errors about nonexistent users you will want to have a look at /etc/default/spamassassin I'm guessing. For the info messages: this has just got to do with your logging level. You will want to decrease it in local.cf or maybe also in the default file.

On 2/06/2022 14:33, Timo Brandt wrote:

Hi all,

I have a running debian 11 with postfix/dovecot and Amavis with clamav / spamassassin. I saw that the spamassassin logfile is growing very fast and found the following entries occurring many times per second.

Can you maybe help me to get this fixed? I searched the internet but did not really find a solution. Do you need any config files to check?

Thanks for your help in advance,
Timo

Thu Jun 2 11:43:11 2022 [1848608] info: spamd: handled cleanup of child pid [1849690] due to SIGCHLD: exit 255
Thu Jun 2 11:43:11 2022 [1848608] info: spamd: handled cleanup of child pid [1849691] due to SIGCHLD: exit 255
Thu Jun 2 11:43:11 2022 [1848608] info: spamd: handled cleanup of child pid [1849692] due to SIGCHLD: exit 255
Thu Jun 2 11:43:11 2022 [1848608] info: spamd: handled cleanup of child pid [1849693] due to SIGCHLD: exit 255
Thu Jun 2 11:43:11 2022 [1848608] info: spamd: server successfully spawned child process, pid 1849698
Thu Jun 2 11:43:11 2022 [1848608] info: spamd: handled cleanup of child pid [1849696] due to SIGCHLD: exit 255
Thu Jun 2 11:43:11 2022 [1849698] error: spamd: cannot run as nonexistent user or root with -u option
Thu Jun 2 11:43:11 2022 [1848608] info: spamd: server successfully spawned child process, pid 1849699
Thu Jun 2 11:43:11 2022 [1848608] info: prefork: child states: SS
Thu Jun 2 11:43:11 2022 [1848608] info: spamd: server successfully spawned child process, pid 1849700
Thu Jun 2 11:43:11 2022 [1848608] info: prefork: adjust: 0 idle children less than 1 minimum idle children. Increasing spamd children: 1849700 started.
Thu Jun 2 11:43:11 2022 [1849699] error: spamd: cannot run as nonexistent user or root with -u option
Thu Jun 2 11:43:11 2022 [1849700] error: spamd: cannot run as nonexistent user or root with -u option
Thu Jun 2 11:43:11 2022 [1848608] info: prefork: child states: SSS
Thu Jun 2 11:43:11 2022 [1848608] info: spamd: server successfully spawned child process, pid 1849701
Thu Jun 2 11:43:11 2022 [1848608] info: prefork: adjust: 0 idle children less than 1 minimum idle children. Increasing spamd children: 1849701 started.
Thu Jun 2 11:43:11 2022 [1848608] info: prefork: child states:
Thu Jun 2 11:43:11 2022 [1848608] info: spamd: server successfully spawned child process, pid 1849702
Thu Jun 2 11:43:11 2022 [1849701] error: spamd: cannot run as nonexistent user or root with -u option
Thu Jun 2 11:43:11 2022 [1848608] info: prefork: adjust: 0 idle children less than 1 minimum idle children. Increasing spamd children: 1849702 started.
Thu Jun 2 11:43:11 2022 [1848608] info: prefork: child states: S
Re: [SPAM?] Re: Memory requirement for SpamAssassin/Postfix/Roundcube/Dovecot stack
If you want to save on memory usage, just having amavis filter out exe files or exe-like files (screensavers, exes in archives, etc.) is much more efficient than using clamav. Of course this doesn't filter out Office macros/OLE, but there's a plugin in SA related to that, I believe.

On 26/05/2022 16:49, Ian Evans wrote:

On Thu, May 26, 2022, 10:36 AM Reindl Harald wrote:

On 26.05.22 at 16:32, Ian Evans wrote:
> File under "questions I think I already know the answer to."
>
> Looking at moving my site to a new host and I'm pondering splitting my
> web/email servers which have always shared the same server.
>
> Our email server is five accounts. Just me and the missus. A big day is
> receiving 200 emails.
>
> Is it safe to assume that a $5/mth 1gig memory account will laugh at the
> resources needed to run a SpamAssassin/Postfix/Roundcube/Dovecot/Nginx
> stack and not ever break a sweat?

when you add clamav later it will be clamav who laughs about 1 GB memory after it has sucked it completely

Just looked at clamav's memory usage. Ouch. :)
Regex error in most recent update
Hi everyone,

I just noticed we had two email servers complain last night after running sa-update about a regex problem:

/etc/cron.daily/spamassassin:
config: invalid regexp for __URI_TRY_3LD 'm,^https?://(?:try(?!r\.codeschool)|start|get(?!\.adobe)|save|check(?!out)|act|compare|join|learn(?!ing)|request|visit(?!or|\.vermont)|my(?!sub|turbotax|news\.apple|a\.godaddy|account|support|build|blob)\w)[^.]*\.[^/]+\.(?Variable length lookbehind is experimental in regex; marked by <-- HERE in m/(?i)^https?://(?:try(?!r\.codeschool)|start|get(?!\.adobe)|save|check(?!out)|act|compare|join|learn(?!ing)|request|visit(?!or|\.vermont)|my(?!sub|turbotax|news\.apple|a\.godaddy|account|support|build|blob)\w)[^.]*\.[^/]+\.(?<-- HERE /
channel 'updates.spamassassin.org': lint check of update failed, channel failed
sa-update failed for unknown reasons

Did anyone else notice the same thing or is it just on our end?

Kind regards,
Bert
Re: Do these domains merit blocking?
You can find the email we received from them here http://paste.debian.net/1223611/ (just the body, idk if anyone also wants headers)

Must admit I thought it was a scam, just because it was its own domain, out of the blue and, as many have mentioned, unsolicited.

Bert

On 15/12/2021 19:24, Charles Sprickman wrote:

Does anyone have a sample of one of their emails? I’m composing a brief nastygram and would like to get my eyes on one before finishing up.

Thanks,
Charles

On Dec 15, 2021, at 11:39 AM, Bill Cole wrote:

There has recently been a spate of odd spams to harvested addresses asking hypothetical questions about domains' privacy practices. It turns out this is a grad student enrolling human subjects in a study without informed consent...

The explanation is at https://measurement.cs.princeton.edu/privacystudy/ and there is a list of domains there which were created to run this maldesigned study. Many of the early batch compounded the consent problem with outright fraud, claiming to be from people who do not exist.

I am curious about what the SA user world thinks of such domains. My personal opinion is that the grad student, his faculty advisors, and his IRB should all be forced to find new careers and the domains should have a null CNAME at the root forever. It appears that URIBL, SURBL, and Spamhaus DBL have all noticed the domains unflatteringly, which I suppose constitutes a more balanced consequence...

A customer has expressed mild dismay at the concept that a fine research institution should be "punished for doing research." I'm less attached to Princeton than my NJ-based customer and (having worked in a NIH-funded lab) less idolizing of the Ivory Tower in general. I have no difficulty explaining my position, but I am rather surprised that I need to in 2021. Am I missing something special that makes such research spam somehow not spam?

--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire
Re: why are not all rules run all the time
DNSWL is a whitelist for mailservers. So the tests based on that use the IP that handed your trusted_networks the email. Several tests are based on the transmitting server instead of just the email contents: contents can be convincing or not, but if the server is notorious for sending spam it will end up on blocklists, for example.

On 8/10/2021 11:57, Thomas Seilund wrote:

On 10/8/21 11:38 AM, Matus UHLAR - fantomas wrote:

On 08.10.21 11:18, Thomas Seilund wrote:

I run SA 3.4.2 on Debian GNU/Linux 10 (buster)

If I look at incoming mails after SA has processed the incoming mail then the list of SA rules that have been run is not the same for all mails. Below are two examples:

X-Spam-Status: No, score=-2.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HTML_MESSAGE,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE, T_KAM_HTML_FONT_INVALID autolearn=ham autolearn_force=no version=3.4.2

X-Spam-Status: No, score=2.9 required=3.0 tests=BAYES_50,HTML_MESSAGE, HTML_MIME_NO_HTML_TAG,MIME_HTML_ONLY,SPF_HELO_PASS,URIBL_BLACK autolearn=no autolearn_force=no version=3.4.2

For instance, rule RCVD_IN_DNSWL_NONE is run for the first mail but not for the second. Why is that?

perhaps the rule did not match, that's how spam score is evaluated. did those mails come from the same host?

Thanks. No, the mails did not come from the same host.

I am a little in the dark here! Why does it matter where the mails came from? In my /etc/spamassassin/local.cf I have nothing about trusted networks. Is it so that the list of rules only show rules that contribute to the score? What do you mean by a rule did not match?
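To make the "rule did not match" point concrete: the tests= field of X-Spam-Status lists only the rules that hit for that particular message, so two mails will normally show different lists. The snippet below is just an illustrative sketch that compares the two headers quoted above; hit_rules is a made-up helper name, and the parsing assumes the plain (non-amavis) header layout shown in this thread.

```python
import re
from typing import Set


def hit_rules(header: str) -> Set[str]:
    """Return the rule names listed after tests= in a plain X-Spam-Status header."""
    match = re.search(r"tests=(.*?)\s+autolearn=", header, re.DOTALL)
    if match is None:
        return set()
    return {name.strip() for name in match.group(1).split(",") if name.strip()}


# The two headers from the quoted message, copied as posted.
first = (
    "X-Spam-Status: No, score=-2.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, "
    "DKIM_VALID,DKIM_VALID_AU,HTML_MESSAGE,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE, "
    "T_KAM_HTML_FONT_INVALID autolearn=ham autolearn_force=no version=3.4.2"
)
second = (
    "X-Spam-Status: No, score=2.9 required=3.0 tests=BAYES_50,HTML_MESSAGE, "
    "HTML_MIME_NO_HTML_TAG,MIME_HTML_ONLY,SPF_HELO_PASS,URIBL_BLACK "
    "autolearn=no autolearn_force=no version=3.4.2"
)

print("hit only for the first mail: ", sorted(hit_rules(first) - hit_rules(second)))
print("hit only for the second mail:", sorted(hit_rules(second) - hit_rules(first)))
print("hit for both mails:          ", sorted(hit_rules(first) & hit_rules(second)))
```

RCVD_IN_DNSWL_NONE shows up only in the first set, which fits the explanation that the rule depends on the relaying server being listed in DNSWL rather than on the message contents.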
Re: Disabling autolearn on given rule
This is complete news to me! Based on the activity on the dev list, I had assumed there were still 10-20 people devoting some of their time to developing SA. If you are the only one, that of course changes my view very much, and would be something worth communicating in some spot.

When I asked about my Bayes bugs in this list a long time ago, I also got very mixed responses on whether my suggested solutions to the bugs I found through discussion on the list were actually the right ones, so I filed those bugs specifically to get feedback on whether my solutions were deemed acceptable by SA developers (assuming there was a whole team working on SA either in the evenings or as part of their job at a company that heavily uses SA).

If the idea is that bugs will most probably never get resolved except if you write and submit patches to solve them, that's completely understandable if there are barely any developers or maintainers, but then people have to be told of course.

Maybe it would then also be a good idea to start some kind of bug review project, similar to how projects like Inkscape have been asking their community to retest *all* bugs, where members from the mailing list and other SA users are encouraged to go through a few bugs at a time, starting with the very oldest ones, to check whether they're still valid and otherwise close them.

There are currently 373 unresolved bugs on bugzilla (if that counter can be trusted, it's the same amount of bugs I get under "my bugs", which seems suspicious), I wouldn't be surprised if over half of those were questions or about things that have long been resolved or become irrelevant. For example, I'm guessing https://bz.apache.org/SpamAssassin/show_bug.cgi?id=5679 can be closed since if this problem had persisted, there would be a ton of reports of those still ongoing.

What do you think?
I would also like to point out, as a sort of PS, that while I do understand that Perl isn't rocket science, there is quite a barrier due to Perl's reputation and the decreasing number of people with experience in Perl. If I'm brutally honest, I would have probably already fixed those 4 bugs I reported myself if SA was on GitHub and written in Python, since I could most probably read the code more easily, and especially submit my changes more easily.

I do understand that SA is like that for historic reasons, and I don't think a rewrite would be sensible at all, but I wouldn't underestimate how much of a deterrent the combination of Perl, Bugzilla, SVN and email patch submission is for new FOSS developers used to the newer languages and GitHub. I for one have no idea how I would submit a fix to SA once I've written it, to give a concrete example. I'm guessing I just paste the patch to a Bugzilla comment and hope someone merges it?

Anyway, this is way offtopic for Matt's initial issue, but probably still relevant since he's hoping to fix it himself.

On 22/09/2021 10:54, Henrik K wrote:

On Wed, Sep 22, 2021 at 10:45:43AM +0200, Bert Van de Poel wrote:

I hope I'm not passing on too much of a negative message. It would be great if someone had a look at the Bayes autolearn code. I think it would be a great service to the community!

The fact is that there really aren't any active developers around these days. We are no different from any other semi-active open source project. I can only give so much of personal free time to "service the community". The community is supposed to try to take care of itself, so where are all the volunteers? :-)

Doing Perl is not rocket science, but getting familiar with SA internals can be daunting. I can help with that, but someone needs to step up with decent effort.
Re: Disabling autolearn on given rule
I think having a look at the code itself is a good idea. I'm not sure if it's up-to-date but you can find some information on https://cwiki.apache.org/confluence/display/SPAMASSASSIN/DevelopmentStuff
I've found that just reporting issues on SA's Bugzilla is completely useless, since it's just used as a fancy interface to display email conversations of the development list. Newly reported bugs or issues often go ignored by email, and their status is never changed since no one uses the interface to manage bugs. This means that Bugzilla is filled to the brim with hundreds of bugs marked as new, of which some are actual bugs and large parts are just questions or fixed problems that were never closed. Bugzilla is also very buggy: for example, when I press "my bugs", I get a list of 373 bugs, some predating the existence of my account, and obviously I didn't take part in the discussion of almost all of them. So keep in mind that Bugzilla can be untrustworthy and that the dev mailing list mentioned on https://cwiki.apache.org/confluence/display/SPAMASSASSIN/mailinglists is connected to that.
If you're planning to work on the Bayes plugin, I can tell you there are several problems with it that I've reported in the past and that have gone ignored:
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7904
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7905
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7906
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7907
I assume many others have also reported valid bugs, but they can be hard to find between the many questions that have been asked on https://bz.apache.org/SpamAssassin/buglist.cgi?quicksearch=bayes_id=34478 and I'm also not too sure we can trust the search functionality.
I hope I'm not passing on too much of a negative message. It would be great if someone had a look at the Bayes autolearn code. I think it would be a great service to the community!
Bert On 22/09/2021 03:29, Matt Corallo wrote: On 9/21/21 18:01, Loren Wilton wrote: None of these seem to accomplish disabling learning for a specific rule I think the problem is that I believe Bayes works off of the total score, and probably only sees rule names as more tokens, if it sees them at all. If it indeed works off the total score, about all you can do is somehow tweak that score for a given rule or rule combination. Right, I expected roughly as much from the docs I could find. Two things, then: (1) maybe time to revisit the old discussions of providing this as a default feature?, (2) where would I go to look at building a plugin for this? Ideally something that ends up upstream, but though I can write code, I know no perl :). Matt
Re: Does anyone know what generates these email headers?
By default, any PHP script that sends an email will include X-PHP-Originating-Script on several Linux distros, even though that is not the official default (see https://www.php.net/manual/en/mail.configuration.php , one of the first Google results). It's pretty common to see that header in automated emails of all kinds (e.g. registration confirmation emails, notifications, login link emails). On its own it's a sign of neither spam nor ham, but combined with other things it can be interesting. The others don't ring a bell for me. Bert
On 8/09/2021 23:27, Loren Wilton wrote: I'm getting a lot of mails with some very curious headers in them. I tried searching with Google, and it has never heard of many of these strings. Does anyone recognize what might be generating these headers?
X-EOPTenantAttributedMessage
X-EmailAdvisor
X-Mxtb-Transitionid
X-MG-Subscriptionuid
X-PHP-Originating-Script
X-EmailTransmit-type
CMM-X-SID-Result
CMM-X-AUTH-Result
CMM-X-Message-Status
X-OutGoing-Spam-Status
X-EmailTransmit-aid
X-rext
Thanks! Loren
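For anyone who wants to see how often that PHP header turns up in their own mail, here is a minimal sketch using only Python's standard library. The sample message, its addresses and the header value are made up for illustration; the "UID:filename" shape of the value is an assumption based on what the PHP documentation describes.

```python
from email import message_from_string

# Hypothetical sample message; addresses, script name and UID are invented.
SAMPLE = """\
From: noreply@example.org
To: user@example.org
Subject: Confirm your registration
X-PHP-Originating-Script: 1000:register.php

Please confirm your account.
"""


def php_originating_script(raw_message):
    """Return the X-PHP-Originating-Script value, or None if the header is absent."""
    msg = message_from_string(raw_message)
    return msg.get("X-PHP-Originating-Script")


if __name__ == "__main__":
    print(php_originating_script(SAMPLE))
```

Header lookup in the `email` package is case-insensitive, so differently capitalised variants of the header name are found as well.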
Re: Office phish
SpamAssassin has plugins for PhishTank and OpenPhish. I would suggest you submit the link to them. You can also reach out to the domain provider, hosting provider(s) and other companies involved. On 30/06/2021 21:51, Alex wrote: Hi, Would anyone like to help me block this office phish? It includes an HTML file that presents an O365 login page: https://pastebin.com/JMSrY6KU More javascript in an HTML file.
Re: Gmail spam filters
Dear Bowie, I'm afraid this really isn't a question for this email list, since it has nothing to do with SpamAssassin. However, to not just send you off with nothing: IP reputation plays a big role for Google. If you're hosted by a provider like OVH, which seems to serve lots of cybercriminals, your IP might have been previously used for spamming and therefore just has a bad reputation already. Spammers nowadays also set up SPF, DKIM and DMARC properly more often. If you've made sure that SSL/TLS is enabled, that SPF, DKIM and DMARC are set up, and that reverse DNS, DNS and your email server's domain are all configured properly, then really the best thing you can do is give it time and ask people to mark your emails as "not spam" in the meantime. You may also consider changing providers/IP if you're with a more notorious provider. I'm afraid you really can't do much more. It's quite unfair, but it's the way things work. But again, this really isn't a question for this list. Perhaps try Libera IRC, some forum or something like Reddit? Kind regards, Bert
On 17/06/2021 17:42, Bowie Bailey wrote: This is a bit off-topic, but I'm hoping someone here might have some suggestions. We are having a problem getting mail to Gmail users. It almost always ends up in their spam folder. I have set up SPF, DKIM, and DMARC. The mail-tester.com email test gives a 10/10 for the test emails I have sent to it. The information I've been able to find from Google is completely unhelpful. I tried signing up for their postmaster tools, but my volume is too low to show any data. Does anyone have any tips on how to get mail through Gmail's spam filters? Thanks, Bowie
Re: Detect Emoticons in Subject
We've also started getting lots of spam with emoji in the subject over the past few weeks, so I've looked into this as well. As mentioned by RW, you would need to create some kind of UTF-8 regex header Subject rule. As I'm not too excited about writing such a regex, it's way at the bottom of my todo list to contemplate whether an SA plugin could be written for that and to then reach out to the SA developers to see whether that would be something upstream would accept. But honestly, I won't be able to do so any time soon (I don't have the time). Still, I thought I'd mention it, since it might be relevant to your question. If you do end up figuring out a regex that works and isn't extremely long, I think plenty of people on this list would love to know! Bert
On 20/05/2021 18:19, RW wrote:
On Thu, 20 May 2021 11:42:59 -0400 Clive Jacques wrote: Hi, I've been using SA a long time. Lately, I'm getting more and more spam with emoticons in the subject line. I'd say about 90% of my emails with emoticons in the subject are spam. I'd like to create a local rule which scores email with emoticons in the subject.
# Local Rule for Emoticons in subject
subject EMOTICON_IN_SUBJECT Subject =~ /\p{Emoticons}/
The rule should start with "header", that's what's causing the lint failure. However, AFAIK, the rule still won't work because \p{Emoticons} isn't supported in spamassassin, which works on byte sequences. You need to rewrite it to match UTF-8 bytes.
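To make the "match UTF-8 bytes" suggestion a bit more concrete, here is a small Python sketch that derives a byte-level pattern for the Unicode Emoticons block (U+1F600 to U+1F64F). The helper name `utf8_range_pattern` is made up for this example, and it only handles ranges whose code points all encode to four UTF-8 bytes. Whether SpamAssassin actually sees the Subject as raw UTF-8 bytes depends on the local configuration, so treat the resulting pattern as a starting point to test, not as a finished rule.

```python
import re
from itertools import groupby


def utf8_range_pattern(first_cp, last_cp):
    """Build a byte-oriented regex for the code points first_cp..last_cp.

    Assumption: every code point in the range encodes to 4 UTF-8 bytes.
    """
    encoded = [chr(cp).encode("utf-8") for cp in range(first_cp, last_cp + 1)]
    if not all(len(e) == 4 for e in encoded):
        raise ValueError("range must consist of 4-byte UTF-8 sequences")
    parts = []
    # Consecutive code points share their first three bytes in runs, and the
    # last byte is contiguous within each run, so one class per run suffices.
    for prefix, group in groupby(encoded, key=lambda e: e[:3]):
        last_bytes = [e[3] for e in group]
        prefix_hex = "".join("\\x%02x" % b for b in prefix)
        parts.append("%s[\\x%02x-\\x%02x]" % (prefix_hex, min(last_bytes), max(last_bytes)))
    return "(?:" + "|".join(parts) + ")"


EMOTICONS = utf8_range_pattern(0x1F600, 0x1F64F)
EMOTICONS_RE = re.compile(EMOTICONS.encode("ascii"))

if __name__ == "__main__":
    print(EMOTICONS)
```

Note that the Emoticons block covers only the classic smiley faces; many other emoji live in different blocks, so a real rule would probably need several such ranges.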
Re: Bayes autolearn: how does it resolve whether rules are body or header related?
Dear Loren, Thank you very much for your email. Based on your message I could deduce there were earlier messages (which I then read through a web archive). For some unexplained reason I never received the previous 3 responses to my email. I hope the university network isn't randomly over-filtering spam again (we've had those kinds of problems for a while now, and it's quite a nuisance; we are much more careful about how we mark spam).
Based on what I've read, I agree that this is indeed a bug (or actually several). I've filed the following bug reports:
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7904 (missing body types, as mentioned by RW)
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7905 (meta tflags=net tests are ignored)
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7906 (meta tflags!=net tests are always header tests)
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7907 (better support for meta tests in autolearning in general, with 2 possible solutions)
Thank you very much to RW and Matus Uhlar for helping me figure out what code to look at, and to all three of you for confirming that this is clearly a set of bugs. Feel free to file more bugs if you think my issue reveals others, as well as to give support, write suggestions or submit patches on the bugs I have already filed. Kind regards, Bert Van de Poel
On 10/05/2021 06:41, Loren Wilton wrote:
so you don't have points from body rules. your mentioned URI_DEOBFU_INSTR is a meta rule: meta URI_DEOBFU_INSTR __URI_DEOBFU_INSTR && !__MSGID_OK_HOST so maybe it's not considered.
They are treated as header, or ignored if marked as net. I think a bug report should be submitted for this. Either they should be treated split 50/50 as header and body score, or when the metas are built they should have a "body rule" flag, and that used to determine where the score goes. I tried, but for some reason apache decided that I'm evil and blocked the submission attempt, so someone else can do it. Loren
Bayes autolearn: how does it resolve whether rules are body or header related?
Dear fellow Spamassassin users, I recently noticed that quite a lot of spam emails with high scores weren't marked for Bayes autolearning. While some senders and receivers were a common match, explaining why autolearn was "no", there was no clear explanation for other cases. I therefore put Spamassassin in debug mode to check in more detail, and noticed that fairly often autolearn is not used because the minimum score for body tests isn't achieved. After looking at some specific cases, however, it seems that several rules are either not considered at all or counted towards the wrong score when calculating the header rule score and body rule score for Bayes autolearning. I've always presumed these scores are calculated based on whether the underlying rule performs a regex on a header or on the body, but now I'm not so sure any more. I hope you can help clear up whether this is intended behaviour (and what that behaviour is) or whether I should report this as a bug.
One example I noticed is URI_DEOBFU_INSTR=3.595. If I understand it correctly, this is a URI test that's performed on the body. Should a test like this be counted towards the body score count? Then there's the question of meta rules such as MONEY_NOHTML. If you resolve the different meta levels within this rule, it's a combination of header and body; however, it's only counted towards the header score. Finally, it seems as if custom rules I've added within local.cf aren't considered. Is that indeed the case (and if so, is that by design)? I'm also not completely sure if UNWANTED_BODY_LANGUAGE and tests like razor, pyzor and DCC are considered for body scores.
Within the same realm, I'm also wondering whether these expected numbers for body and header can be tweaked and if so, how. For example, the case below isn't autolearned even though it has a huge score and a vast number of tests going off, but seemingly not enough body-related scores. Is that really the intended behaviour?
May 8 10:40:32 mail amavis[4076058]: (4076058-16) header_edits_for_quar: -> , Yes, score=24.619 tag=- tag2=5 kill=7.5 tests=[ADVANCE_FEE_3_NEW_MONEY=0.001, AXB_XMAILER_MIMEOLE_OL_024C2=0.001, BAYES_50=0.8, BERT_KULSPAM=1, FORGED_MUA_OUTLOOK=1.927, FREEMAIL_FORGED_REPLYTO=2.095, FREEMAIL_REPLYTO=1, FREEMAIL_REPLYTO_END_DIGIT=0.25, FROM_MISSPACED=0.001, FROM_MISSP_EH_MATCH=0.001, FROM_MISSP_FREEMAIL=0.001, FROM_MISSP_MSFT=0.001, FROM_MISSP_REPLYTO=2.497, FSL_BULK_SIG=0.001, FSL_CTYPE_WIN1251=0.001, FSL_NEW_HELO_USER=0.001, KHOP_HELO_FCRDNS=0.398, LOTS_OF_MONEY=0.001, MISSING_HEADERS=1.021, MISSING_MID=0.497, MONEY_FREEMAIL_REPTO=1.202, MONEY_FROM_MISSP=0.001, MONEY_NOHTML=2.497, NSL_RCVD_HELO_USER=0.001, PYZOR_CHECK=1.392, REPLYTO_WITHOUT_TO_CC=1.552, REPTO_419_FRAUD=2.996, SPF_HELO_NONE=0.001, TO_NO_BRKTS_FROM_MSSP=1.593, TO_NO_BRKTS_MSFT=1.888, XFER_LOTSA_MONEY=0.001] autolearn=no autolearn_force=no
Thank you in advance for your help. If you need any more examples or would like us to run some tests, then feel free to let me know. Kind regards, Bert Van de Poel ULYSSIS
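When comparing many such log lines, it can help to pull the per-rule scores out of the tests=[...] list and total them per group. The sketch below does that in Python. It is only an illustration: the function names are made up, and the set of "body" rules is supplied by the caller as a guess, because which rules SpamAssassin really counts as body or header for autolearning is exactly the open question in this thread.

```python
import re


def parse_tests(log_line):
    """Extract a {rule_name: score} dict from the tests=[...] part of a log line."""
    match = re.search(r"tests=\[(.*?)\]", log_line, re.S)
    if not match:
        return {}
    scores = {}
    for item in match.group(1).split(","):
        name, _, value = item.strip().partition("=")
        if name and value:
            scores[name] = float(value)
    return scores


def split_scores(scores, assumed_body_rules):
    """Sum scores into (header_total, body_total).

    assumed_body_rules is the caller's own guess at which rules are body
    rules; it does NOT reflect SpamAssassin's internal classification.
    """
    body = sum(v for k, v in scores.items() if k in assumed_body_rules)
    header = sum(v for k, v in scores.items() if k not in assumed_body_rules)
    return round(header, 3), round(body, 3)


# Shortened sample built from three of the tests in the log line above.
SAMPLE = "Yes, score=4.689 tests=[BAYES_50=0.8, MONEY_NOHTML=2.497, PYZOR_CHECK=1.392] autolearn=no"

if __name__ == "__main__":
    parsed = parse_tests(SAMPLE)
    print(round(sum(parsed.values()), 3))
    print(split_scores(parsed, {"MONEY_NOHTML"}))
```

Running the parser on the full log line and summing the values is also a quick sanity check that the listed tests add up to the reported total score.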
Re: Why does sa-compile access the bayes db?
Oh, I had misunderstood you, Matus. My bad! I thought you meant we should use a separate bayes db for every mailbox user, but now I understand you were referring to the amavis user which indeed runs everything. I just moved the existing bayes db (after stopping amavis of course) to the amavis user's .spamassassin folder and removed the path from local.cf and it seems to work just fine and indeed solves our issue with sa-compile. Thank you very much for the suggestion. This is a much cleaner solution than what I had initially in mind!
On 28/05/2020 17:03, Matus UHLAR - fantomas wrote:
On 28.05.20 15:32, Bert Van de Poel wrote: Almost all of the email we process are forwarders. It doesn't really make sense for us to do a non-global bayes db. The large majority of email we process is also for a uniform group: student organizations at our local university.
you have apparently missed what I said before, so I repeat: you said you use amavis. amavis daemon runs (usually) under amavis user. Therefore, all mails processed by amavis use amavis' bayes database stored in amavis home directory. move the database to amavis' home (and chown it to the amavis user):
# ls -la ~amavis/.spamassassin/
total 41368
drwx-- 2 amavis amavis 4096 May 28 16:59 .
drwxr-x--- 7 amavis amavis 4096 May 28 06:50 ..
-rw--- 1 amavis amavis 89136 May 28 17:01 bayes_journal
-rw--- 1 amavis amavis 21065728 May 28 16:59 bayes_seen
-rw--- 1 amavis amavis 40144896 May 28 16:59 bayes_toks
-rw-r--r-- 1 amavis amavis 2304 May 5 12:41 user_prefs
Then remove global setting of bayes database in /etc/spamassassin/local.cf and your problem will most probably go away.
On 28.05.20 13:38, Bert Van de Poel wrote: We're using a global bayes_path defined in local.cf:
On 28/05/2020 15:22, Matus UHLAR - fantomas wrote: This is your problem imho.
if you use amavis, you need no bayes database, but amavis users', i guess in /var/lib/amavis/.spamassassin/
On 28/05/2020 10:18, Matus UHLAR - fantomas wrote:
On 25.05.20 23:34, Bert Van de Poel wrote: Recently, we've been setting up Bayesian learning on our existing Amavis with Spamassassin setup on Ubuntu 18.04 (Spamassassin 3.4.2-0ubuntu0.18.04.3 and Amavis 1:2.11.0-1ubuntu1). We've decided to use a global db that was seeded with an aggregation of spam and ham we've received, then enabling autolearn to further train the set. As Spamassassin runs inside Amavis, the Bayes database files are owned by the amavis user. This setup works fine, and results for Bayes are great and growing in accuracy by autolearning. What was somewhat confusing is that we noticed our daily cronjob running sa-update and sa-compile was giving us an error concerning permissions:
May 25 00:31:25.488 [8381] warn: bayes: cannot write to /var/lib/spamassassin/bayes_db/bayes_journal, bayes db update ignored: Permission denied
bayes: cannot write to /var/lib/spamassassin/bayes_db/bayes_journal, bayes db update ignored: Permission denied
I wonder where did these files come from. did you set bayes_path in /etc/spamassassin/ ?
Re: Why does sa-compile access the bayes db?
Almost all of the email we process are forwarders. It doesn't really make sense for us to do a non-global bayes db. The large majority of email we process is also for a uniform group: student organizations at our local university.
On 28/05/2020 15:22, Matus UHLAR - fantomas wrote:
On 28.05.20 13:38, Bert Van de Poel wrote: We're using a global bayes_path defined in local.cf:
This is your problem imho. if you use amavis, you need no bayes database, but amavis users', i guess in /var/lib/amavis/.spamassassin/
On 28/05/2020 10:18, Matus UHLAR - fantomas wrote:
On 25.05.20 23:34, Bert Van de Poel wrote: Recently, we've been setting up Bayesian learning on our existing Amavis with Spamassassin setup on Ubuntu 18.04 (Spamassassin 3.4.2-0ubuntu0.18.04.3 and Amavis 1:2.11.0-1ubuntu1). We've decided to use a global db that was seeded with an aggregation of spam and ham we've received, then enabling autolearn to further train the set. As Spamassassin runs inside Amavis, the Bayes database files are owned by the amavis user. This setup works fine, and results for Bayes are great and growing in accuracy by autolearning. What was somewhat confusing is that we noticed our daily cronjob running sa-update and sa-compile was giving us an error concerning permissions:
May 25 00:31:25.488 [8381] warn: bayes: cannot write to /var/lib/spamassassin/bayes_db/bayes_journal, bayes db update ignored: Permission denied
bayes: cannot write to /var/lib/spamassassin/bayes_db/bayes_journal, bayes db update ignored: Permission denied
I wonder where did these files come from. did you set bayes_path in /etc/spamassassin/ ?
Re: Why does sa-compile access the bayes db?
We're using a global bayes_path defined in local.cf:
use_bayes 1
use_bayes_rules 1
bayes_auto_learn 1
bayes_expiry_max_db_size 150
bayes_path /var/lib/spamassassin/bayes_db/bayes
bayes_file_mode 0775
bayes_ignore_to spam-analy...@ulyssis.org
bayes_ignore_from spam-analy...@ulyssis.org
bayes_auto_learn_threshold_nonspam 0.1
bayes_auto_learn_threshold_spam 10.0
score BAYES_00 -0.001 -0.001 -0.001 -0.001
score BAYES_05 -0.001 -0.001 -0.001 -0.001
score BAYES_20 -0.001 -0.001 -0.001 -0.001
score BAYES_40 -0.001 -0.001 -0.001 -0.001
score BAYES_50 0.001 0.001 0.001 0.001
score BAYES_60 0.001 0.001 0.001 0.001
score BAYES_80 0.001 0.001 0.001 0.001
score BAYES_95 0.001 0.001 0.001 0.001
score BAYES_99 0.001 0.001 0.001 0.001
score BAYES_999 0.001 0.001 0.001 0.001
Currently we're still evaluating the amount of false positives (and contacting users who seem to have broken cronjobs that confuse bayes) before taking away the artificial scores. We wanted to clear up our sa-compile cronjob error.
On 28/05/2020 10:18, Matus UHLAR - fantomas wrote:
On 25.05.20 23:34, Bert Van de Poel wrote: Recently, we've been setting up Bayesian learning on our existing Amavis with Spamassassin setup on Ubuntu 18.04 (Spamassassin 3.4.2-0ubuntu0.18.04.3 and Amavis 1:2.11.0-1ubuntu1). We've decided to use a global db that was seeded with an aggregation of spam and ham we've received, then enabling autolearn to further train the set. As Spamassassin runs inside Amavis, the Bayes database files are owned by the amavis user. This setup works fine, and results for Bayes are great and growing in accuracy by autolearning.
What was somewhat confusing is that we noticed our daily cronjob running sa-update and sa-compile was giving us an error concerning permissions:
May 25 00:31:25.488 [8381] warn: bayes: cannot write to /var/lib/spamassassin/bayes_db/bayes_journal, bayes db update ignored: Permission denied
bayes: cannot write to /var/lib/spamassassin/bayes_db/bayes_journal, bayes db update ignored: Permission denied
I wonder where did these files come from. did you set bayes_path in /etc/spamassassin/ ?
Re: Why does sa-compile access the bayes db?
Plugin initialization+journal sync would make a lot of sense. What would be the cleanest solution in that case? It's quite annoying to receive the same error mail every day. Should we use --cnf to disable the bayes plugin, or is there a more elegant solution? Should we file a bug about this? On 26/05/2020 00:45, RW wrote: On Mon, 25 May 2020 23:34:27 +0200 Bert Van de Poel wrote: My question therefore specifically is: what exactly does sa-compile do to the bayes database files? I don't know for sure, but it's probably just a side-effect of initializing plugins. Possibly it's trying to perform an opportunistic sync on the journal file. sa-compile doesn't need to access Bayes, so you could just treat it as a cosmetic error. I wouldn't change ownership or permissions just for this.
Why does sa-compile access the bayes db?
Dear Spamassassin users and developers, Recently, we've been setting up Bayesian learning on our existing Amavis with Spamassassin setup on Ubuntu 18.04 (Spamassassin 3.4.2-0ubuntu0.18.04.3 and Amavis 1:2.11.0-1ubuntu1). We've decided to use a global db that was seeded with an aggregation of spam and ham we've received, then enabling autolearn to further train the set. As Spamassassin runs inside Amavis, the Bayes database files are owned by the amavis user. This setup works fine, and results for Bayes are great and growing in accuracy by autolearning. What was somewhat confusing is that we noticed our daily cronjob running sa-update and sa-compile was giving us an error concerning permissions: May 25 00:31:25.488 [8381] warn: bayes: cannot write to /var/lib/spamassassin/bayes_db/bayes_journal, bayes db update ignored: Permission denied bayes: cannot write to /var/lib/spamassassin/bayes_db/bayes_journal, bayes db update ignored: Permission denied While this makes a lot of sense, considering that the files are owned by the amavis user, we were quite surprised this cronjob would need to access these files in the first place. Looking further into the issue, we figured out it was specifically sa-compile, and the specific message probably originated from /usr/share/perl5/Mail/SpamAssassin/BayesStore/DBM.pm. While I have some programming experience, I was sadly unable to understand this Perl file enough to properly comprehend why this code was accessing bayes_journal and what it was planning to do there. My question therefore specifically is: what exactly does sa-compile do to the bayes database files? I've asked this same question on IRC but was unable to get an answer. While a fix for this issue changing permissions and user/group ownership is rather obvious, we'd first want to understand what sa-compile is up to. Kind regards, Bert Van de Poel ULYSSIS
Custom rule aware of occurrences
Dear fellow Spamassassin users, I'm contacting you as a member of ULYSSIS. ULYSSIS is a student non-profit organisation at the University of Leuven trying to make computers and technology more approachable and available to students. As part of this objective, we run a hosting service within our university's network for student organisations, student unions and individuals at our university.
We've battled with spam from time to time, since we seem to attract a lot of spam in exotic languages, which is rather good at circumventing commonly used methods. This has had us resort to some custom rulesets to battle mostly targeted French and SEO spam, often coming from very respectable servers and very normal addresses. Now because SEO spam specifically has been adapting quite well to any rule we think of (finding alternative ways of saying the same thing time and time again), I was hoping to write a rule that basically boiled down to "give some spam score to emails that contain the word SEO 3 or more times" to push those already being detected by other rules over the edge. To be clear, this will be a low score rule; I'm aware that ham can perfectly well contain that word 3 times, just like this email for example.
Now while investigating, I started wondering how to deal with the fact that some spam will just have a plain text body, while other spam will also feature HTML, which means that the count may suddenly double/halve. Beyond that, it seems quite hacky to use a regex that boils down to something like /\bSEO\b.*\bSEO\b.*\bSEO\b/i instead of something that is properly aware of the count of certain words. Since I sort of expected Spamassassin to have a solution for both the text/text+html and the counting problems, I asked around on IRC but was pointed here. So any suggestions or pointers are more than welcome. I'm not too sure if any more information is required, but feel free to ask questions or correct my presumptions if necessary.
Kind regards, Bert Van de Poel ULYSSIS University of Leuven
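To illustrate the counting idea outside of SpamAssassin, here is a minimal Python sketch. It counts whole-word occurrences of "SEO" per MIME part and takes the maximum over the parts, which is one possible way to avoid counting the text and HTML alternatives of the same message twice. The function names and the dict-of-parts input are assumptions made for this example, not anything SpamAssassin provides. Within SpamAssassin itself, the `tflags` options `multiple` and `maxhits=N` combined with a meta rule may be the more direct route; the Mail::SpamAssassin::Conf documentation is the place to check that.

```python
import re

# Whole-word, case-insensitive match, equivalent in spirit to /\bSEO\b/i.
SEO_WORD = re.compile(r"\bSEO\b", re.IGNORECASE)


def count_seo(text):
    """Count whole-word occurrences of 'SEO' in one piece of text."""
    return len(SEO_WORD.findall(text))


def seo_threshold_hit(parts, threshold=3):
    """Return True if any single part mentions 'SEO' at least `threshold` times.

    parts: mapping of MIME type to decoded text (a hypothetical input shape).
    Using the maximum over parts, not the sum, keeps a text+html message
    from being counted twice.
    """
    highest = max((count_seo(text) for text in parts.values()), default=0)
    return highest >= threshold


if __name__ == "__main__":
    message = {
        "text/plain": "Cheap SEO! Our SEO experts boost your seo ranking.",
        "text/html": "<p>Cheap SEO! Our SEO experts boost your seo ranking.</p>",
    }
    print(seo_threshold_hit(message))
```

Unlike the chained `.*` regex, this counts matches directly, so changing the threshold is a one-parameter change.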