RE: SORBS bites the dust

2009-06-23 Thread Jeff Moss
On Mon, 22 Jun 2009, Arvid Picciani wrote:
 rich...@buzzhost.co.uk wrote:
  It comes with great sadness that I have to announce the imminent
  closure of SORBS.
 crap ...  sorbs is the only list I trust enough to have them at SMTP level.

In the past, I did some tests to determine which lists caught the most
spam without FP's, and found that sbl-xbl.spamhaus.org (not the full
'zen' rbl), was catching over 90% of spam. I also use njabl, though
lately it looks like it mostly overlaps with spamhaus, but the 'web' and
'dul' lists from sorbs are still catching a couple of 100 spam each day
that were not caught by spamhaus. So I would really hate to see SORBS go.

IMPORTANT: If sorbs does not get picked-up by a new host, will SA
developers be ready to roll-out an SA update to remove the sorbs rules, so
that we don't suffer a bunch of timeouts? Or how does that work?

- Charles
 
WHAT?  Sorbs and Spamhaus are polar opposites.  Spamhaus is a great
organization while SORBS is a POS that helped give all blacklists a bad name.
I don't know if SpamAssassin has ever used it. 
 
  Jeff Moss




RE: my emailBL is live!

2009-05-01 Thread Jeff Moss
 The chance of a collision really is much smaller than I thought, even
 including the birthday paradox.  But rather than just say it's small and
 ask you to take my word for it I'm providing a link.  The Wikipedia page
 for Birthday Attack has a chart that shows the probability of collision
 for hashes of various lengths.

 http://en.wikipedia.org/wiki/Birthday_attack

Well nuts.  Unless my estimation is wrong, my half-length MD5sum would
be 64-bit and thus the 10^-18 probability of collisions would require
a db of 190 entries rather than full-length MD5sum's 820 billion.

Unless corrected, I'll revise my algorithm this evening.

Well, a 64-bit hash with a 10^-18 probability of collisions would only allow 
about 6 entries in the DB.  However a 10^-12 probability should be good enough 
because there probably aren't a trillion unique email addresses.  Going by the 
same chart, a 10^-12 probability of collision would allow about 6,100 entries 
in the DB; 6 million entries corresponds to a probability of roughly 10^-6.
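
The chart figures can be roughly reproduced with the usual birthday-bound approximation n ~ sqrt(2 * 2^b * ln(1/(1-p))), where b is the hash length in bits and p is the acceptable collision probability. The sketch below is only a sanity check under that approximation and assumes uniformly distributed hashes; the function name max_entries is made up for illustration.

```python
import math

def max_entries(bits: int, p: float) -> float:
    """Approximate number of uniformly random hashes of the given bit
    length that can be stored before the chance of any collision reaches p.
    Uses the birthday-bound approximation n ~ sqrt(2 * 2**bits * ln(1/(1-p)))."""
    space = 2.0 ** bits
    # log1p keeps precision for very small p, where ln(1/(1-p)) is close to p.
    return math.sqrt(2.0 * space * -math.log1p(-p))

if __name__ == "__main__":
    for bits in (64, 128):
        for p in (1e-18, 1e-15, 1e-12, 1e-6):
            n = max_entries(bits, p)
            print(f"{bits}-bit hash, p={p:g}: about {n:.3g} entries")
```

Under this approximation a 64-bit hash reaches 10^-18 at about 6 entries and 10^-15 at about 190, while a 128-bit hash reaches 10^-15 at about 820 billion.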
 
This is not to suggest that I ever understood the part about using half-length 
MD5.

  Jeff Moss




RE: my emailBL is live!

2009-04-30 Thread Jeff Moss
Rob McEwen wrote:

 A word of caution.  Be very careful how you use the list.

 OK. I was wrong. Due to this discussion, I'm convinced that MD5 of the
 whole (lower case!) e-mail address is best, with the entire e-mail
 address still showing up in plain text in the DNS txt record.

 But I have some questions:

 (1) is MD5 of the entire address reasonably safe from collisions?
 (consider the 'birthday paradox' before being too quick to answer)

Yes. The chance of a collision is ridiculously small. Not worth worrying
about.

The chance of a collision really is much smaller than I thought, even including 
the birthday paradox.  But rather than just say it's small and ask you to take 
my word for it I'm providing a link.  The Wikipedia page for Birthday Attack 
has a chart that shows the probability of collision for hashes of various 
lengths.

http://en.wikipedia.org/wiki/Birthday_attack
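
For reference, hashing the whole lower-cased address with MD5, as discussed above, might look like the sketch below. The zone name emailbl.example and both helper names are hypothetical; the thread does not say how the list is actually queried.

```python
import hashlib

def address_key(address: str) -> str:
    """MD5 hex digest (32 hex chars, 128 bits) of the trimmed,
    lower-cased e-mail address."""
    normalized = address.strip().lower()
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()

def query_name(address: str, zone: str = "emailbl.example") -> str:
    """Hypothetical DNS name to look up: <md5 of address>.<zone>."""
    return f"{address_key(address)}.{zone}"

if __name__ == "__main__":
    # Case differences in the input do not change the resulting key.
    print(query_name("Someone@Example.COM"))
```

Lower-casing before hashing matters because the hash of "Someone@Example.COM" and the hash of "someone@example.com" would otherwise be unrelated values.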

  Jeff Moss





RE: Trying out a new concept

2008-09-23 Thread Jeff Moss
This will actually work.  I've been involved in a university experiment doing 
this for over a year now.  Simply put, trying to create a list of new spammer 
domains is a count to infinity problem.  Creating a list of old domains is 
not.
 
  Jeff Moss



From: Marc Perkel [mailto:[EMAIL PROTECTED]
Sent: Mon 9/22/2008 5:49 PM
To: users@spamassassin.apache.org
Subject: Trying out a new concept



I don't know how this will work but I'm building the data now. For those
of you who are familiar with Day old bread lists to detect new domains,
as you know there's a lag time in the data and they often don't have
data from all the registries. So - here's a different solution.

What I'm thinking is to accumulate every domain name that interacts with
my system and store it in a list. Eventually after a week or so I
should have a good list. Then the idea is to do a lookup to see if a new
domain is NOT on the list. This will catch all really new domains, but
will have some false positives. But - if it is mixed with other
conditionals it might be a good way to detect and block spam from or
linking to tasting domains.

Thoughts?
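
A minimal in-memory sketch of the idea described above: record every domain seen, then treat any domain that is not on the list as new. The class and method names are hypothetical, and a real deployment would need persistent storage and would combine the result with other conditions, as the post suggests.

```python
import time
from typing import Dict, Optional

class SeenDomains:
    """Remembers the first time each domain was observed."""

    def __init__(self) -> None:
        self._first_seen: Dict[str, float] = {}

    @staticmethod
    def _normalize(domain: str) -> str:
        # Domain names are case-insensitive; drop any trailing root dot.
        return domain.strip().lower().rstrip(".")

    def observe(self, domain: str, now: Optional[float] = None) -> None:
        """Record a domain the first time it shows up in traffic."""
        stamp = time.time() if now is None else now
        self._first_seen.setdefault(self._normalize(domain), stamp)

    def is_new(self, domain: str) -> bool:
        """True if the domain has never been observed."""
        return self._normalize(domain) not in self._first_seen

if __name__ == "__main__":
    seen = SeenDomains()
    seen.observe("Example.com.")
    print(seen.is_new("example.com"), seen.is_new("brand-new.example"))
```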





RE: Measuring the world's biggest email domains (fwd)

2008-04-29 Thread Jeff Moss
 
 I am the chairman of a German eco working group about 
 Sender-Authentication (http://www.eco.de/arbeitskreise/sauth.htm), in 
 this context I started
http://www.agitos.de/dkim-reputation-project.html 
 which reveals interesting results: especially the blocking of single 
 spammer accounts coming from ISP mail domains is an interesting +.

I'm very surprised to see this.  I've been working on a similar
project at UNC-Charlotte since August of 2007, and I had not
heard of its existence.  Our project is called RepuScore.

http://isr.uncc.edu/RepuScore/

I'm an undergraduate computer science student attached to RepuScore
as my required senior project.  I work as a coder and data collector
for PhD Candidate Gautam Singaraju, and Dr. Brent Kang, who are the 
main researchers.  We have published several well received papers at
conferences in the US.

It has become clear to me that reputation for authenticated
domains is the next big weapon in the fight against spam.  The only
remaining uncertainty is who will have the first and/or best deployment.

  Jeff Moss


RE: Configuring SA as frontend to Exchange

2008-04-10 Thread Jeff Moss
I've done this a few times and it works really well.  I use Linux, Postfix,
SpamAssassin, ClamAV, and a super lightweight cut down version of the now-dead
Amavisd-lite.

I use this system as an inbound email relay on, or outside, the corporate
firewall boundary and put Exchange inside.  That way if the relay system
comes under attack Exchange still works correctly inside the firewall. I
configure the firewall to only allow my outside email relay to send inbound
traffic to the Exchange server.

In order to avoid accepting email for users I don't have I've got a script
that reaches into the Active Directory through LDAP and gets a list of legal
email recipients every day.  I found the script on the net somewhere but
I had to tweak the output a little to make Postfix happy.
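
The output-tweaking step can be sketched as follows. This is not the script mentioned above, which is not shown in the thread. It assumes the LDAP query has already returned Active Directory proxyAddresses values such as "SMTP:user@example.com", and it produces lines in the "address OK" form used by Postfix lookup tables such as relay_recipient_maps. The function name is hypothetical.

```python
from typing import Iterable, List

def postfix_recipient_lines(proxy_addresses: Iterable[str]) -> List[str]:
    """Turn AD proxyAddresses values into sorted, de-duplicated
    'address OK' lines for a Postfix recipient map."""
    recipients = set()
    for value in proxy_addresses:
        prefix, _, address = value.partition(":")
        # Only SMTP entries are mail addresses; AD marks the primary one
        # with upper-case 'SMTP:' and aliases with lower-case 'smtp:'.
        if prefix.lower() == "smtp" and "@" in address:
            recipients.add(address.strip().lower())
    return [f"{address} OK" for address in sorted(recipients)]

if __name__ == "__main__":
    sample = ["SMTP:Bob@Example.com", "smtp:bob@example.com",
              "X400:c=US;a= ;p=Org", "smtp:alice@example.com"]
    print("\n".join(postfix_recipient_lines(sample)))
```

The resulting file would then typically be compiled with postmap before Postfix can use it as a lookup table.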

  Jeff Moss

 

-Original Message-
From: Henry Kwan [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, April 09, 2008 2:24 PM
To: users@spamassassin.apache.org
Subject: Configuring SA as frontend to Exchange


Hi,

Have been running SA on CentOS for a few years now and everything has been
working great.  But the powers that be want to move to Exchange so I am trying
to plan a SA frontend that feeds the Exchange server.

As I was thinking over how SA works now and how it might work in my future
setup, I was wondering how you would feed unmarked spam to the SA frontend?
Since email is passed through to Exchange, it isn't stored on the SA server
anymore like it is now.  Or would I be limited to just having SA autolearn?

Also, if anyone has some good links to setting up a SA frontend to Exchange,
that would be much appreciated.

Thanks!




Re: Problem logging from SA when running Amavisd

2007-09-24 Thread Jeff Moss

 Jeff,

  What I was hoping to do was write stuff to the log file for a week or
  two using the info() method.  Then I could grep out my lines, get
  the data analyzed, and then finish the plugin.
 
  I am a fairly experienced programmer but I have not used object oriented
  Perl before.  Thankfully it doesn't seem that different from other OO
  languages. Anyway I don't mind hacking up a temporary version of
  Amavisd if you could tell me how to get SA to quit logging to STDERR.

 Ok, here is a patch to amavisd-new 2.5.2 (works with SA 3.2.3), which
 achieves what you need, or at least should get you going. It hooks
 its own logging module into SpamAssassin, so it receives all logging
 from SA. It maps SpamAssassin log levels into amavisd log levels
 (which in turn are mapped into syslog priorities), so you should
 be seeing for example SA 'info:' at amavisd log level 1, and 'dbg:'
 at log level 5 (so you must have $log_level=5 in order to see dbg:).
 Change the mapping to taste if you like.

---snip snip snip--

Thanks for the patch, Mark.  I'll put it in production tomorrow.  Could
you please take a minute to explain the underlying issue to me?  I don't
understand why SA does not log without the patch.  Is SA intentionally
logging to STDERR, or is Amavisd's connection to syslog causing SA to
lose its connection?

  Jeff Moss


Re: Problem logging from SA when running Amavisd

2007-09-21 Thread Jeff Moss
 When SpamAssassin is invoked by amavisd, the SA debug log goes to STDERR.
 There is currently no configurable way to let amavisd hook into SA logging
 and capture its output, although it is doable and on a TODO list.

 For the moment you can redirect STDERR to a file and let it run
 for a while for diagnostic purposes, e.g.:

   Mark


What I was hoping to do was write stuff to the log file for a week or two
using the info() method.  Then I could grep out my lines, get the data
analyzed, and then finish the plugin.  (I'm not the PhD in this operation;
I'm just an undergraduate.)

I am a fairly experienced programmer but I have not used object oriented Perl
before.  Thankfully it doesn't seem that different from other OO languages.
Anyway I don't mind hacking up a temporary version of Amavisd if you could
tell me how to get SA to quit logging to STDERR.

  Jeff Moss 


Problem logging from SA when running Amavisd

2007-09-20 Thread Jeff Moss
I'm working on a SpamAssassin plugin for a university research project.
I've debugged a lot of it by running SpamAssassin from the command line,
and using the SA logger's dbg() and info() methods to output stuff.

Now I need to put it in a production server and see the same debug
information in the log file.  The production servers run Amavisd.
The problem is that I can't get any output to the log file.  The only
lines in the log file come from Amavisd, and not from SA itself.

I noticed that SpamAssassin uses Sys::Syslog and Amavisd uses
Unix::Syslog so I thought that might be the reason.  After a
little more searching I found that Sys::Syslog requires you to
run syslogd with a -r qualifier.  So I edited the config file for
syslogd and restarted it, but it didn't help.

I know this is the SA users list, not the developers list, but most
plugins are written by users not developers, so that's why I asked here.
Can anyone help me with this?

  Jeff Moss


Razor2 servers are down?

2007-02-01 Thread Jeff Moss
-Original Message-
From: Roman Serbski [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, January 30, 2007 11:52 PM
To: users@spamassassin.apache.org
Subject: Razor2 servers are down?

Hi list,

It looks like Razor2 servers are down? SA stopped checking messages
for me this morning with the following in razor-agent.log:

[ 3] Unable to connect to c124.cloudmark.com:2703; Reason: Operation
now in progress.
[ 3] Unable to connect to c125.cloudmark.com:2703; Reason: Operation
now in progress.
[ 3] Unable to connect to c121.cloudmark.com:2703; Reason: Operation
now in progress.
[ 3] Unable to connect to c118.cloudmark.com:2703; Reason: Operation
now in progress.

As soon as I disabled Razor2 in local.cf and restarted spamd, things
started working as they should (with no Razor2 support though).

Thanks.
Roman
--
Razor is working fine for me here at HuffmanCorp.com
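
One quick way to tell a server outage from a local network or firewall problem is to test whether a plain TCP connection to the host and port named in razor-agent.log can be opened at all. The sketch below only checks TCP reachability and does not speak the Razor protocol; the helper name can_connect is hypothetical.

```python
import socket

def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within
    timeout seconds, False on refusal, timeout, or DNS failure."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution errors.
        return False
```

For example, can_connect("c124.cloudmark.com", 2703) uses a host name and port taken from the log lines quoted above. A False result from one machine and True from another points at the local network rather than at the servers.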

  Jeff Moss


RE: SPF is hopelessly broken and must die!

2006-12-14 Thread Jeff Moss
 Why was this topic not started on the SPF list? Was the original poster of
 this topic looking to get MORE attention on the SpamAssassin list?

I was wondering the same thing.  This list was once useful for people
maintaining SA installations but now at least half the traffic is
useless. 

  Jeff Moss


I've got TORA.08 spelled with numbers?

2006-11-17 Thread Jeff Moss
I'm getting a bunch of spams this morning that have
TORA.08 spelled out with numbers like this.

4216775   0611576   215556 7 3308011   3258576
   6  7 5   153 85 2   7 3
   8  3 6   50   4   1   2 7   0 5
   7  2 2   257873  5 7  4 1   3387715
   6  2 5   7  1   111500075 8 6   2 2
   8  2 2   7   7  3   2   656   0 3   0 8
   0  6430533   44 8   6   207   5412501   7637213


Does anybody know what this is about?

  Regards
  Jeff Moss


OCR scanner still causing SA to crash

2006-08-04 Thread Jeff Moss
The OCR scanner is still causing SA to crash sometimes even though I've
got everything patched properly.  I can't figure out what the problem is.
When I use giftopnm and gocr on the offending images from the command line
I don't get any errors.  I guess I'll turn it off until Monday when I can
work on it again.

  Jeff Moss 

-Original Message-
From: Davin Flatten [mailto:[EMAIL PROTECTED] 
Sent: Thursday, August 03, 2006 4:34 PM
To: Jeff Moss; users@spamassassin.apache.org
Subject: Re: GIF Spam -- Setting up the 'OCR scanner and image validator
SA-plugin'

Jeff-

You might also want to try copying the message out of a client 
application like Thunderbird, then copying the image to your server and 
running giftopnm on it.  It might be that uudeview is the problem and 
not giftopnm.  The errors sound like a corrupt gif image.  This should 
not affect the plugin however.
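
To rule out the extraction tool, the image can also be pulled out of the raw message with Python's standard email package and compared with what uudeview produced. This is a sketch of an alternative extraction path, not part of the plugin; the function name extract_images is hypothetical.

```python
import email
from email import policy
from typing import List, Tuple

def extract_images(raw_message: bytes) -> List[Tuple[str, bytes]]:
    """Return (filename, decoded bytes) for every image/* part
    of a raw RFC 822 message."""
    msg = email.message_from_bytes(raw_message, policy=policy.default)
    images = []
    for index, part in enumerate(msg.walk()):
        if part.get_content_maintype() != "image":
            continue
        # Fall back to a generated name when the part has no filename.
        name = part.get_filename() or f"part{index}.{part.get_content_subtype()}"
        data = part.get_payload(decode=True)  # undoes base64 / quoted-printable
        images.append((name, data or b""))
    return images
```

If the bytes returned here differ from the file uudeview wrote, the decoding step is the more likely culprit; if they match, the image itself is suspect.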

I would suggest turning on debugging output in SpamAssassin to see where
in the plugin the problem is occurring. Use the facility 'ocrtext' and 
grep your logs for 'ocrtext'.  That should give you some info.

If you are running spamd, try:  --debug=ocrtext

-D, --debug[=areas]    Print debugging messages (for areas)

Hope this helps.
-Davin


RE: GIF Spam -- Setting up the 'OCR scanner and image validator SA-plugin'

2006-08-03 Thread Jeff Moss
We're getting some image-spam stuck in the queue because they crash SA
with this plugin turned on. We are using a custom setup built from amavisd-lite.
I'm still trying to figure out what's causing it.

  Jeff Moss 

-Original Message-
From: Stuart Johnston [mailto:[EMAIL PROTECTED] 
Sent: Thursday, August 03, 2006 10:41 AM
To: users@spamassassin.apache.org
Subject: Re: GIF Spam -- Setting up the 'OCR scanner and image validator
SA-plugin'

Davin Flatten wrote:
 Just thought this might help someone out.  Thanks to M. Blapp for an 
 excellent SA Plugin.  Optical Character Recognition (OCR) can be used to 
 nab those pesky spam messages that are hidden in gif,jpeg, or png images...

This OCR stuff looks promising.  Any comments on performance?  How much
extra load does it put on a server?



RE: GIF Spam -- Setting up the 'OCR scanner and image validator SA-plugin'

2006-08-03 Thread Jeff Moss
Still trying to debug SA crashing with the OCR plugin.  I extracted the
base64 encoding from one of the offending messages.  Then I converted it
to image001.gif with uudeview.  But when I try to convert it to a pnm
file from the command line I get errors.

[filter]# giftopnm image001.gif > image001.pnm
giftopnm: too much input data, ignoring extra...
giftopnm: bogus character 0x00, ignoring
[filter]#

I have no idea what's causing this, how to fix it, or if it's even
related to the crashing problem.
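
A cheap first check on an extracted file is whether it starts with a GIF signature and ends with the GIF trailer byte. The sketch below does only that; it is not a GIF parser and does not reproduce the checks giftopnm performs, so a file can pass here and still be rejected by giftopnm. The function name gif_sanity is hypothetical.

```python
def gif_sanity(data: bytes) -> dict:
    """Cheap structural checks on the bytes of a GIF file."""
    return {
        # Every GIF starts with one of these two 6-byte signatures.
        "has_signature": data[:6] in (b"GIF87a", b"GIF89a"),
        # A complete GIF data stream ends with the trailer byte 0x3B (';').
        "ends_with_trailer": data.endswith(b"\x3b"),
        "size": len(data),
    }

if __name__ == "__main__":
    import sys
    # Usage: python gif_sanity.py image001.gif
    if len(sys.argv) > 1:
        with open(sys.argv[1], "rb") as handle:
            print(gif_sanity(handle.read()))
```

A missing trailer would hint at a truncated or padded file, which is worth knowing before suspecting the OCR plugin itself.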

  Jeff Moss


-Original Message-
From: Stuart Johnston [mailto:[EMAIL PROTECTED] 
Sent: Thursday, August 03, 2006 10:41 AM
To: users@spamassassin.apache.org
Subject: Re: GIF Spam -- Setting up the 'OCR scanner and image validator
SA-plugin'

Davin Flatten wrote:
 Just thought this might help someone out.  Thanks to M. Blapp for an 
 excellent SA Plugin.  Optical Character Recognition (OCR) can be used to 
 nab those pesky spam messages that are hidden in gif,jpeg, or png images...

This OCR stuff looks promising.  Any comments on performance?  How much
extra load does it put on a server?



RE: GIF Spam -- Setting up the 'OCR scanner and image validator SA-plugin'

2006-08-03 Thread Jeff Moss
Patching GIF.pm seems to have fixed the problem.  I patched gocr because
that was in the instructions that got posted, but patching GIF.pm wasn't
so I missed it.

  Jeff Moss

-Original Message-
From: Davin Flatten [mailto:[EMAIL PROTECTED] 
Sent: Thursday, August 03, 2006 3:54 PM
To: Jeff Moss
Cc: users@spamassassin.apache.org
Subject: Re: GIF Spam -- Setting up the 'OCR scanner and image validator
SA-plugin'

Jeff-

Make sure you apply the patches to both the gocr source and 
Image::ExifTool.   The gocr patch deals specifically with the segfault 
issues.

 From the docs:

# - Perl module Image::ExifTool and a patch for GIF pics:
#   http://antispam.imp.ch/patches/patch-GIF-Colortable
#
# - Gocr from http://jocr.sourceforge.net and a patch to
#   avoid segfaults with gocr:
#   http://antispam.imp.ch/patches/patch-gocr-segfault


Hope this helps.
-Davin