Re: message/rfc822 to mbox script for use with sa-learn workflow

2017-08-14 Thread Ian Zimmerman
On 2017-08-14 20:08, Scott wrote:

> I would like to turn around and put those individual messages back
> into mbox format, again, without changing their original headers.

The first question is: why?  sa-learn works on just about any format:
individual messages, multiple messages in a flat directory, maildirs.

If in spite of the above you _must_ have an mbox file, I would just set up
a trivial procmail config (maybe even an empty one, supplemented with
one or two environment variables including DEFAULT) and pipe the
messages through procmail one by one.

You probably need the -f option to force generation of the From_ mbox
delimiter.
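
For anyone who would rather avoid procmail, here is a minimal sketch of a
different route to the same result, using Python's standard mailbox module
instead of the procmail approach described above. The function name and the
paths are hypothetical, and it assumes one message per file and that nothing
else is writing to the mbox at the same time (no locking is done). The
library adds a From_ line for each message and escapes body lines that start
with "From "; the original headers are written through untouched.

```python
import mailbox
from pathlib import Path


def files_to_mbox(message_dir, mbox_path):
    """Append every regular file in message_dir to the mbox at mbox_path.

    Each file is assumed to hold exactly one raw message. Returns the
    number of messages appended.
    """
    box = mailbox.mbox(mbox_path)  # created if it does not exist yet
    count = 0
    try:
        for path in sorted(Path(message_dir).iterdir()):
            if path.is_file():
                # Passing raw bytes avoids re-serialising the headers.
                box.add(path.read_bytes())
                count += 1
        box.flush()
    finally:
        box.close()
    return count
```

Calling files_to_mbox("reviewed/spam", "spam.mbox") (both names made up)
would give a file that can be passed to sa-learn with its --mbox option.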

-- 
Please don't Cc: me privately on mailing lists and Usenet,
if you also post the followup to the list or newsgroup.
Do obvious transformation on domain to reply privately _only_ on Usenet.


Re: Operators Blacklist Survey

2017-08-14 Thread Bill Cole

On 14 Aug 2017, at 18:00, Shivram Krishnan wrote:


> Hi,
>
> I am a graduate student at the University of Southern California and am
> currently researching the impact of false positives in blacklists.


Apparently they don't bother with a mandatory Research Methodology 
course for grad students any more. That's disappointing.



> I am aware that spamassassin uses blacklists in its rule-based system to
> stop spam messages. But since it is a rule-based system, even if there are
> false positives in blacklists, there may be other rules which can influence
> spamassassin to mark it correctly. There are several other blacklists which
> are used to stop different attacks (e.g. phishing, DDoS, malware hosting,
> etc.). I was wondering if operators in general use external
> blacklists (uribl, spamhaus, spamcop, etc.) in the form of a rule-based
> system (like spamassassin) or use them outright to block all IPs listed in
> them.


Asking that question HERE assures that you will get a badly skewed 
sample.


The majority of SA users do not read this list. The majority of email 
admins do not use SA. Many who do use DNSBLs don't understand that they 
do so, because the mail filtering is in a box they were told they never 
need to touch or is done externally by a filtering provider who won't 
tell customers what they use. A very large fraction of legitimate mail, 
possibly a majority, flows between and within a few large providers who 
do not use SA, may or may not cooperate with and/or use publicly 
available DNSBLs, and will never admit to using anything other than 
their own tools for spam filtering.


> It would be great if you could take this four-question survey, which can
> help me understand the usage of blacklists by operators.


Unfortunately my current answers would be very unusual, because I 
recently lost the job where I actively managed mail systems for pay, and 
the micro-systems I manage for myself and friends who ask for help are 
tiny and ridiculously unrepresentative.


But no matter, I'll act like I still have that job or the one before it 
or any of the others I've had managing mail systems in the age of 
DNSBLs.



> The survey consists of
> these questions -
> 1) The size of the network(s) you manage (in terms of customers)


That is confidential and proprietary business information which I am not 
authorized to share.



> 2) List of external blacklists used.


That is confidential and proprietary business information which I am not 
authorized to share.



> 3) How are these blacklists used: in a rule-based system, for outright
> blocking, or both?


That is confidential and proprietary business information which I am not 
authorized to share.


> 4) If external blacklists are used in a non-rule-based system, how do you
> overcome false positives?


That is confidential and proprietary business information which I am not 
authorized to share.


I expect that a large percentage of professional email admins would 
answer identically. I would not recommend trusting any who answered 
substantively.


I would also recommend against sharing this message with your faculty 
advisor. Some questions cannot be answered accurately or meaningfully by 
taking surveys of those willing to answer. Spam control is an 
operational security facility. People doing it who understand their jobs 
will not discuss the details.





Re: message/rfc822 to mbox script for use with sa-learn workflow

2017-08-14 Thread Scott
Maybe not rfc822 format.  This is a sample extracted single file:
https://pastebin.com/S9W4Z64N





--
View this message in context: 
http://spamassassin.1065346.n5.nabble.com/message-rfc822-to-mbox-script-for-use-with-sa-learn-workflow-tp138362p138363.html
Sent from the SpamAssassin - Users mailing list archive at Nabble.com.


message/rfc822 to mbox script for use with sa-learn workflow

2017-08-14 Thread Scott
I have a script that can take spam/ham messages forwarded as attachments from
Outlook and turn them into rfc822 individual files.  It allows external
users to send me Outlook spam/ham for review.  I will in turn feed sa-learn
with those messages once vetted.  That part of the process is getting me the
messages intact as far as I can tell, as the user received them.  I could
pipe those messages to sa-learn directly; that's what the script is designed
to do.  But I don't trust the user's submissions, and prefer to review
first.  FYI, the script that handles the separation of the attachments is
from here:
http://www.localside.net/sal-wrapper/

I would like to turn around and put those individual messages back into mbox
format, again, without changing their original headers.  Anyone have a
script or a method which will accomplish that?  I tried to figure out how to
do it but was unsuccessful.

--
View this message in context: 
http://spamassassin.1065346.n5.nabble.com/message-rfc822-to-mbox-script-for-use-with-sa-learn-workflow-tp138362.html
Sent from the SpamAssassin - Users mailing list archive at Nabble.com.


Operators Blacklist Survey

2017-08-14 Thread Shivram Krishnan
Hi,


I am a graduate student at the University of Southern California and am
currently researching the impact of false positives in blacklists. I am
aware that spamassassin uses blacklists in its rule-based system to stop
spam messages. But since it is a rule-based system, even if there are false
positives in blacklists, there may be other rules which can influence
spamassassin to mark it correctly. There are several other blacklists which
are used to stop different attacks (e.g. phishing, DDoS, malware hosting,
etc.). I was wondering if operators in general use external
blacklists (uribl, spamhaus, spamcop, etc.) in the form of a rule-based system
(like spamassassin) or use them outright to block all IPs listed in them.
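
To make the two usage styles concrete, here is a minimal sketch and not a
description of any real deployment. The zone name dnsbl.example, the per-zone
scores, and all function names are hypothetical. The only conventions taken
from real systems are the usual IPv4 DNSBL query form (reversed octets
followed by the list's zone) and SpamAssassin's default required score of
5.0. No DNS lookup is performed here; the set of zones that listed the
address is passed in by the caller.

```python
import ipaddress


def dnsbl_query_name(ip, zone):
    """Build the DNS name queried for an IPv4 address on a DNSBL zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def outright_block(listed_zones):
    """Outright use: a single listing is enough to reject."""
    return bool(listed_zones)


def rule_based_verdict(listed_zones, other_points, zone_scores, threshold=5.0):
    """Rule-based use: each listing only adds points to a running total.

    other_points stands for the sum of all non-blacklist rules, which may
    be negative and can therefore outweigh a false-positive listing.
    """
    total = other_points + sum(zone_scores.get(z, 0.0) for z in listed_zones)
    return total >= threshold
```

With a hypothetical score of 2.0 for dnsbl.example, a listed sender whose
other rules sum to -1.0 is rejected by outright_block but passes
rule_based_verdict, which is the false-positive cushion the paragraph above
describes.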

It would be great if you could take this four-question survey, which can help
me understand the usage of blacklists by operators. The survey consists of
these questions -
1) The size of the network(s) you manage (in terms of customers)
2) List of external blacklists used.
3) How are these blacklists used: in a rule-based system, for outright
blocking, or both?
4) If external blacklists are used in a non-rule-based system, how do you
overcome false positives?

The link to the survey is below -

https://docs.google.com/forms/d/e/1FAIpQLSe-hgYD-ifkFMyPHrqYL0b7jAkbWjOKiAQjh-oI4mYeiVQg2g/viewform


Shivram


Re: TxRep can't use SQLBasedAddrList factory module

2017-08-14 Thread Kevin A. McGrail

On 8/13/2017 8:49 AM, Christopher Engelhard wrote:

> Log:
> spamd[8299]: TxRep: illegal factory setting
> spamd[8299]: TxRep: could not open storages, quitting!
>
> Config:
> header   TXREP   eval:check_senders_reputation()
> describe TXREP   Score normalizing based on sender's reputation
> tflags   TXREP   userconf noautolearn
> priority TXREP   1000
> txrep_factory module Mail::SpamAssassin::SQLBasedAddrList
> user_awl_dsn DBI:mysql:spamdb:localhost
> user_awl_sql_username
> user_awl_sql_password
> user_awl_sql_table   txrep



Off the cuff, it looks fine to me.

does mysql -u  -p localhost spamdb work?