why is WHOIS_DMNBYPROXY scoring on _my_ domain?

2007-03-13 Thread snowcrash+spamassassin

in emails sent TO me ([EMAIL PROTECTED]), i'm noting SA scores of,

*  0.5 WHOIS_DMNBYPROXY Contains URL registered to Domains by Proxy
*  [URIs: mydomain.com]

now, mydomain.com *IS*, in fact, reg'd @ Domains by Proxy ...
legitimately.  but, why is it scoring on _MY_ domain ?

no doubt i've misconfigured something in local.cf :-/

suggestions?

thanks.


Re: why is WHOIS_DMNBYPROXY scoring on _my_ domain?

2007-03-13 Thread snowcrash+spamassassin

Perhaps I misstated my question.

Why is this triggering/scoring on *MY* domain on *INBOUND* email?

I can understand if it's triggering on the sender's DBP registration
-- but it's triggering, again, on _mine_.


Re: why is WHOIS_DMNBYPROXY scoring on _my_ domain?

2007-03-13 Thread snowcrash+spamassassin

you can use uridnsbl_skip_domain
to cause your domain to not be checked against uri blacklists.


that's what i was looking for.

thanks.
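
For reference, the suggestion above comes down to one directive in the site config. A minimal sketch, assuming the site config lives at /etc/mail/spamassassin/local.cf (that path and the add_skip_domain helper are assumptions for illustration; mydomain.com is the placeholder used above). Whether this silences WHOIS_DMNBYPROXY depends on that rule being a URI-blacklist rule, as the reply implies.

```shell
# Assumed location of the site-wide config; override LOCAL_CF if yours differs.
LOCAL_CF="${LOCAL_CF:-/etc/mail/spamassassin/local.cf}"

# Hypothetical helper: append "uridnsbl_skip_domain <domain>" unless that
# exact line is already present, so repeated runs do not duplicate it.
add_skip_domain() {
    domain="$1"
    line="uridnsbl_skip_domain $domain"
    if ! grep -qxF "$line" "$LOCAL_CF" 2>/dev/null; then
        printf '%s\n' "$line" >> "$LOCAL_CF"
    fi
}

# Usage (then lint, and restart spamd so the change is picked up):
#   add_skip_domain mydomain.com
#   spamassassin --lint
```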


sa-compile usage questions

2007-03-11 Thread snowcrash+spamassassin

some questions about sa-compile usage:

(1) how do we verify that the compiled rules are working? is a
'healthy' --lint sufficient?

(2) how do/should we measure the improved (hopefully) performance due
to the compiled rules?

(3) do compiled rules automatically take precedence over uncompiled
rules? or, must we remove the uncompiled rules from the rule path?
i'm guessing it's done auto-magically ...

(4) should (must?) we run sa-compile at every rule update?  e.g., if
we're sa-update'ing once/hr, running sa-compile hourly is more than a
bit cpu intensive.

(5) does sa-compile detect diffs/changes between runs and only compile
changes? or does it recompile ALL available rules each time?

thanks.
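
On question (4), one common arrangement is to recompile only when sa-update reports that it actually installed something. A sketch, relying on sa-update's documented exit codes (0 when an update was downloaded and installed, 1 when nothing new was available); the function name and the SA_UPDATE / SA_COMPILE variables are made up here so the commands can be swapped out.

```shell
# Commands to run; these variable names are illustrative, not part of SA.
SA_UPDATE="${SA_UPDATE:-sa-update}"
SA_COMPILE="${SA_COMPILE:-sa-compile}"

# Hypothetical wrapper: compile only after a successful, non-empty update.
update_then_compile() {
    rc=0
    $SA_UPDATE || rc=$?
    case "$rc" in
        0) $SA_COMPILE ;;   # new rules were installed: recompile
        1) return 0 ;;      # nothing new: skip the expensive compile
        *) echo "sa-update failed with exit code $rc" >&2
           return "$rc" ;;
    esac
}

# Usage from an hourly cron job:
#   update_then_compile
```

With this arrangement the hourly job costs little on the hours when no channel has changed.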


random/occasional permission denied failures in sa-update

2007-02-26 Thread snowcrash+spamassassin

at seemingly random intervals, sometimes after days of working just
fine with no errors, and with no extraordinary actions on my part,
sa-update will fail with:

channel: attempt to rm channel cf file failed, attempting to continue
anyway at /usr/local/spamassassin/bin/sa-update line 742.
error: can't remove file
/var/mail/spamassassin/updates/3.002000/updates_spamassassin_org/10_default_prefs.cf:
Permission denied
channel: attempt to rm channel directory failed, attempting to
continue anyway at /usr/local/spamassassin/bin/sa-update line 745.
error: failed to open
/var/mail/spamassassin/updates/3.002000/updates_spamassassin_org/10_default_prefs.cf
for write: Permission denied at /usr/local/spamassassin/bin/sa-update
line 993.
channel: archive extraction failed, channel failed
error: can't remove file
/var/mail/spamassassin/updates/3.002000/updates_spamassassin_org/10_default_prefs.cf:
Permission denied
channel: attempt to clean up failed extraction also failed!

suggestions as to why this is happening, and what to do about it?

thanks.
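
One possible cause of this kind of intermittent failure is an sa-update run under a different account (root from a manual run, say) that leaves files the usual cron user cannot remove. That is a guess, not a diagnosis, but it is cheap to check. A sketch; the helper name is made up, and the account name in the usage line is an assumption.

```shell
# Hypothetical helper: list everything under a directory that is NOT owned
# by the given user (name or numeric uid). No output means ownership is
# consistent.
find_foreign_files() {
    dir="$1"
    user="$2"
    find "$dir" ! -user "$user" -print
}

# Usage, with the update path from the error above and an assumed account:
#   find_foreign_files /var/mail/spamassassin/updates spamassassin
```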


v320/trunk r511659 sa-compile hangs ...

2007-02-25 Thread snowcrash+spamassassin

follow-on to the fix in,

   http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5340

building r511659, on,

   spamassassin/bin/sa-compile --sudo -D


i see lots of output, including many instances of,

  [13689] dbg: generic: giving up on that direction: brace mismatch
in '7(02' at 
/usr/local/lib/perl/sitelib/Mail/SpamAssassin/Plugin/BodyRuleBaseExtractor.pm
line 516.

then, the whole process hangs @,


  [13689] dbg: zoom: NO
  /(?!guarantee)[gk6]{1,2}[\s\d_*\$\%(),.:;?!}{\[\]|\/?^#~\xa1`'+-]
  ?[uv\xb5\xd9\xda\xdb\xdc\xfc\xfb\xfa\xf9\xfd]{1,2}[\s\d_*\$\%(),.
  :;?!}{\[\]|\/?^#~\xa1`'[EMAIL PROTECTED]
  \xe3\xe2\xe0\xe1\xe2\xe3\xe4\xe5\xe60o]{1,2}[\s\d_*\$\%(),.:;?!}{
  \[\]|\/?^#~\xa1`'+-]?r{1,2}[\s\d_*\$\%(),.:;?!}{\[\]|\/?^#~\xa1`'
  [EMAIL PROTECTED]
  xe3\xe4\xe5\xe60o]{1,2}[\s\d_*\$\%(),.:;?!}{\[\]|\/?^#~\xa1`'+-]?
  [n\xd1\xf1]{1,2}[\s\d_*\$\%(),.:;?!}{\[\]|\/?^#~\xa1`'+-]?[t|]{1,
  2}[\s\d_*\$\%(),.:;?!}{\[\]|\/?^#~\xa1`'+-]?[e3\xc8\xc9\xca\xcb\
  xe8\xe9\xea\xeb\xa4]{1,2}[\s\d_*\$\%(),.:;?!}{\[\]|\/?^#~\xa1`'+-
  ]?[e3\xc8\xc9\xca\xcb\xe8\xe9\xea\xeb\xa4]{1,2}/i
  [13689] dbg: zoom: NO
  /\b(?!investor)[ilt|!1y?\xcc\xcd\xce\xcf\xec\xed\xee\xef]{1,2}[\s
  \d_*\$\%(),.:;?!}{\[\]|\/?^#~\xa1`'+-]?[n\xd1\xf1]{1,2}[\s\d_*\$\
  %(),.:;?!}{\[\]|\/?^#~\xa1`'+-]?(?:[vu]|\\\/){1,2}[\s\d_*\$\%(),.
  :;?!}{\[\]|\/?^#~\xa1`'+-]?[e3\xc8\xc9\xca\xcb\xe8\xe9\xea\xeb\
  xa4]{1,2}[\s\d_*\$\%(),.:;?!}{\[\]|\/?^#~\xa1`'+-]?[sz5\xa6\xa7]{
  1,2}[\s\d_*\$\%(),.:;?!}{\[\]|\/?^#~\xa1`'+-]?[t|]{1,2}[\s\d_*\$\
  %(),.:;?!}{\[\]|\/?^#~\xa1`'+-]?[go0\xd2\xd3\xd4\xd5\xd6\xd8\xf0\
  xf2\xf3\xf4\xf5\xf6\xf8]{1,2}[\s\d_*\$\%(),.:;?!}{\[\]|\/?^#~\xa1
  `'+-]?r{1,2}/i

and goes no further ...  ctrl-C is req'd to exit.


Re: Crooked JPG's not being recognized by FuzzyOCR?

2007-02-25 Thread snowcrash+spamassassin

Do these hit for anyone else?


fwiw, it scores 6.000 for me,

2007-02-25 17:10:33 [21699] JPEG: [360x491] crookedjpg.jpg (55507)
2007-02-25 17:10:33 [21699] Found: 1 images
2007-02-25 17:10:33 [21699] Found JPEG header name=crookedjpg.jpg
2007-02-25 17:10:33 [21699] Calculating image hash for:
/tmp/.spamassassin21699JCCwwAtmp/crookedjpg.jpg.pnm
2007-02-25 17:10:35 [21699] Scanset Order: ocrad(0) ocrad-invert(0)
ocrad-decolorize-invert(0) ocrad-decolorize(0) gocr(0) gocr-180(0)
2007-02-25 17:10:37 [21699] Scanset ocrad found word target with
fuzz of 0.1667
 line: r targe ln ss
2007-02-25 17:10:37 [21699] Scanset ocrad found word company with
fuzz of 0.2857
 line: compnriewilo rerces inc coder oc  w e s  p k 
2007-02-25 17:10:37 [21699] Scanset ocrad found word target with
fuzz of 0.1667
 line: rtargelnss
2007-02-25 17:10:38 [21699] Scanset ocrad found word company with
fuzz of 0.2857
 line: compnriewilorercesinccoderocwespk
2007-02-25 17:10:45 [21699] Scanset ocrad-decolorize found word
target with fuzz of 0.1667
 line: r targe ln ss
2007-02-25 17:10:46 [21699] Scanset ocrad-decolorize found word
company with fuzz of 0.2857
 line: compnriewilo rerces inc coder oc  w e s  p k 
2007-02-25 17:10:46 [21699] Scanset ocrad-decolorize found word
target with fuzz of 0.1667
 line: rtargelnss
2007-02-25 17:10:46 [21699] Scanset ocrad-decolorize found word
company with fuzz of 0.2857
 line: compnriewilorercesinccoderocwespk
2007-02-25 17:10:48 [21699] Scanset gocr-180 found word trade with
fuzz of 0.2000
 line: clorgomadtpbak
tareniegtosglattntwnrdgmwnnilcinstruserbsyj abmricwargahjc
cnohwerwoitc   w rta k pi c 
2007-02-25 17:10:48 [21699] Scanset gocr-180 found word trade with
fuzz of 0.2000
 line:
clorgomadtpbaktareniegtosglattntwnrdgmwnnilcinstruserbsyjabmricwargahjccnohwerwoitcwrtakpic
2007-02-25 17:10:48 [21699] Message is spam, score = 6.000
2007-02-25 17:10:48 [21699] Adding Hash to
/var/mail/spamassassin/local/FuzzyOcr.db with score 6.000
2007-02-25 17:10:48 [21699] Words found:
 target in 1 lines
 company in 1 lines
 target in 1 lines
 company in 1 lines
 (4 word occurrences found)


Re: v318/trunk v320/trunk showing different header displays on FuzzyOCR test

2007-02-22 Thread snowcrash+spamassassin

an additional test, with a 'sent/recd' email, rather than just a file
test @ cmd_line, shows similarly,

with this image,

http://img181.imageshack.us/img181/2156/spamsc2.gif

attached to an otherwise blank email, on receipt, i see in FuzzyOCR.log,

 2007-02-22 14:22:57 [27803] Processing Message with ID
[EMAIL PROTECTED]
([EMAIL PROTECTED] - no receipients)
 2007-02-22 14:25:10 [6298] Processing Message with ID
[EMAIL PROTECTED] (SnowCrash
[EMAIL PROTECTED] - SnowCrash
[EMAIL PROTECTED])
 2007-02-22 14:25:10 [6298] GIF: [320x512] spam.gif (10195)
 2007-02-22 14:25:10 [6298] Found: 1 images
 2007-02-22 14:25:10 [6298] Found GIF header name=spam.gif
 2007-02-22 14:25:11 [6298] Image is single non-interlaced...
 2007-02-22 14:25:12 [6298] Calculating image hash for:
/tmp/.spamassassin6298Zhf5nItmp/spam.gif.pnm
 2007-02-22 14:25:12 [6298] Scanset Order: ocrad(0) ocrad-invert(0)
ocrad-decolorize-invert(0) ocrad-decolorize(0) gocr(0) gocr-180(0)
 2007-02-22 14:25:14 [6298] Scanset ocrad found word target with
fuzz of 0.
 line: target s
 2007-02-22 14:25:14 [6298] Scanset ocrad found word investor
with fuzz of 0.2500
 line:  fhe lncreasing inrest receilled br th liile gotwtg
 2007-02-22 14:25:14 [6298] Scanset ocrad found word breaking
with fuzz of 0.2500
 line:  fhe lncreasing inrest receilled br th liile gotwtg
 2007-02-22 14:25:22 [6298] Scanset ocrad-decolorize found word
target with fuzz of 0.
 line: target s
 2007-02-22 14:25:22 [6298] Scanset ocrad-decolorize found word
investor with fuzz of 0.2500
 line:  fhe lncreasing inrest receilled br th liile gotwtg
 2007-02-22 14:25:23 [6298] Scanset ocrad-decolorize found word
breaking with fuzz of 0.2500
 line:  fhe lncreasing inrest receilled br th liile gotwtg
 2007-02-22 14:25:23 [6298] Scanset gocr found word erectile with
fuzz of 0.2500
 line:  e increasln ingrest receiled hr j lirg ne  t u t  
 2007-02-22 14:25:23 [6298] Scanset gocr found word target with
fuzz of 0.
 line: target 
 2007-02-22 14:25:24 [6298] Scanset gocr found word erectile with
fuzz of 0.2500
 line: eincreaslningrestreceiledhrjlirgnetut
 2007-02-22 14:25:24 [6298] Scanset gocr found word buy with fuzz of 0.
 line: momemnsborqbuy
 2007-02-22 14:25:24 [6298] Scanset gocr found word target with
fuzz of 0.
 line: target
 2007-02-22 14:25:25 [6298] Scanset gocr-180 found word target
with fuzz of 0.
 line: target 
 2007-02-22 14:25:26 [6298] Scanset gocr-180 found word buy with
fuzz of 0.
 line: momemnsborqbuy
 2007-02-22 14:25:26 [6298] Scanset gocr-180 found word target
with fuzz of 0.
 line: target
 2007-02-22 14:25:26 [6298] Message is spam, score = 9.500
 2007-02-22 14:25:26 [6298] Adding Hash to
/var/mail/spamassassin/local/FuzzyOcr.db with score 9.500
 2007-02-22 14:25:26 [6298] Words found:
 erectile in 1 lines
 target in 1 lines
 erectile in 1 lines
 buy in 1 lines
 target in 1 lines
 (7.5 word occurrences found)


in the rec'd message's header, i see,

 ...
 X-Spam-Report:
   *  0.1 RDNS_NONE Delivered to trusted network by a host with no rDNS
   *  0.0 DK_POLICY_SIGNSOME Domain Keys: policy says domain signs some mails
   *  0.0 DKIM_POLICY_SIGNSOME Domain Keys Identified Mail: policy says domain
   *   signs some mails
   *  0.0 DK_SIGNED Domain Keys: message has a signature
   *  0.0 DKIM_SIGNED Domain Keys Identified Mail: message has a signature
   *  1.0 DC_IMG_TEXT_RATIO BODY: Low body to pixel area ratio
   *  0.0 BAYES_00 BODY: Bayesian spam probability is 0 to 1%
   *  [score: 0.0002]
   *  2.2 TVD_SPACE_RATIO BODY: TVD_SPACE_RATIO
   *  1.2 SARE_GIF_ATTACH FULL: Email has a inline gif
   *  9.5 FUZZY_OCR BODY:
 ...


*again*, with no header 'detail' for the FUZZY_OCR BODY header :-/

since i'm seeing the same 'missing header' biz on both,

(1) rec'd email proc'd via spamd running on my mailserver
(2) test file submitted to spamassassin via cmd line,

and, differing behavior for sa v318 & v320, with the same version of
FuzzyOCR, i suspect this is a SA-related issue.

but if/what/where?

thanks.


[ot-ish] fuzzyocr still being developed?

2007-02-21 Thread snowcrash+spamassassin

following the numerous questions on list, i've gathered that fuzzyocr
is rather popular -- we use it, too.

i've not noticed recent bug-fixing, src dev (~ 1 month), or comments
here, from the dev.

just wondering -- is the proj still alive? dev vacation, maybe? or,
has the proj been subsumed _into_ SA when i wasn't looking?

thanks for any input/update.


Re: [ot-ish] fuzzyocr still being developed?

2007-02-21 Thread snowcrash+spamassassin

I think he's just busy.  AFAIK it is still being worked on.


if true, then certainly fair enough. thanks.

given that image-spam has become such a huge part of the battle, my view is
that fuzzyocr should be _in_ the SA project/distribution.

i'm sure there are myriad reasons against it, not the least of which
is/may_be well-deserved developer pride, but as it stands for us at
the moment, two third party SA plugins,

  FuzzyOcr
  Botnet

are doing a disproportionate -- and effective! -- fraction of the
overall 'heavy-lifting' in our spam-defense.

just my $0.02 ...


v320, ASN plugin requires config, or not?

2007-02-21 Thread snowcrash+spamassassin

in 320.pre, re: ASN, i find,

# ASN - look up the Autonomous System Number of connecting IP
# requires additional configuration, see plugin's POD docs
# loadplugin Mail::SpamAssassin::Plugin::ASN

yet, in

man Mail::SpamAssassin::Plugin::ASN

i read,

CONFIGURATION
   This plugin has no user-serviceable parts or configurations.

so, what additional configuration is required?


v3.2-dev error if Rule2XSBody plugin loaded ...

2007-02-20 Thread snowcrash+spamassassin

hi,

intrigued by some of the forthcoming features of v3.2, i've built up a
test-instance of,

spamassassin --version
SpamAssassin version 3.2.0-pre1-r499012
  running on Perl version 5.8.8

currently, on launch of sa, i see @ console,

[21402] error: Can't locate
Mail/SpamAssassin/CompiledRegexps/body_0.pm in @INC (@INC ...
/var/mail/spamassassin/updates/compiled/3.002000
/var/mail/spamassassin/updates/compiled/3.002000/auto) at (eval 536)
line 1.


this is with,

loadplugin Mail::SpamAssassin::Plugin::Rule2XSBody

in init.pre

if i disable the compile plugin, on restart i get no errors.

known issue? my config?

thanks.


Re: v3.2-dev error if Rule2XSBody plugin loaded ...

2007-02-20 Thread snowcrash+spamassassin

hi,


that's to be expected until you actually run sa-compile to compile
the ruleset...


ah. i'd misunderstood (ok, presumed ...) that that was automatically
done ... thanks!

now,

% /usr/local/spamassassin/bin/sa-compile --sudo -D

[21503] dbg: logger: adding facilities: all
[21503] dbg: logger: logging level is DBG
[21503] dbg: generic: SpamAssassin version 3.2.0-pre1-r499012
[21503] dbg: config: score set 0 chosen.
[21503] dbg: dns: is Net::DNS::Resolver available? yes
[21503] dbg: dns: Net::DNS version: 0.59
sa-compile: cannot write to
/var/mail/spamassassin/updates/compiled/3.002000, aborting

checking,

ls -ald /var/mail/spamassassin/updates/compiled/3.002000
/usr/local/bin/ls: cannot access
/var/mail/spamassassin/updates/compiled/3.002000: No such file or
directory

then,

mkdir -p /var/mail/spamassassin/updates/compiled/3.002000
chown -R spam:spam /var/mail/spamassassin/updates/compiled/3.002000

and again,

% /usr/local/spamassassin/bin/sa-compile --sudo -D

still reports,

[21503] dbg: logger: adding facilities: all
[21503] dbg: logger: logging level is DBG
[21503] dbg: generic: SpamAssassin version 3.2.0-pre1-r499012
[21503] dbg: config: score set 0 chosen.
[21503] dbg: dns: is Net::DNS::Resolver available? yes
[21503] dbg: dns: Net::DNS version: 0.59
sa-compile: cannot write to
/var/mail/spamassassin/updates/compiled/3.002000, aborting

ownership/privileges?
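
One thing worth ruling out: the directory was handed to spam:spam, but the write test may be happening as whichever account is actually running sa-compile. A sketch of that check, to be run as the same user that invokes sa-compile; the helper name is made up.

```shell
# Hypothetical helper: report whether the current user can write to a
# directory, and show its owner and mode when it cannot.
check_writable() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 2
    fi
    if [ -w "$dir" ]; then
        echo "writable by uid $(id -u): $dir"
        return 0
    fi
    echo "NOT writable by uid $(id -u): $dir"
    ls -ld "$dir"
    return 1
}

# Usage, with the path from the error above:
#   check_writable /var/mail/spamassassin/updates/compiled/3.002000
```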


Re: v3.2-dev error if Rule2XSBody plugin loaded ...

2007-02-20 Thread snowcrash+spamassassin

or possibly a bug :(  Worth opening a bug on bugzilla.  You could try
strace'ing the process to see exactly what it's seeing...


i'll open a bug, but i'm useless -- without a little guidance -- as to
what to do re: strace-ing, as i'm on a mac.

thanks.


Re: what does the 'new' --allowupdates option to sa-update do?

2007-02-15 Thread snowcrash+spamassassin

  since i certainly trust the project, and DOS' contributions, should i
  simply mod my cron jobs to,
 
  sa-update --allowplugins --channelfile .../DIST-channels.conf
  sa-update --allowplugins --channelfile .../SARE-channels.conf

 my understanding of Theo's comments is no you shouldn't do that.  My
 understanding of what he said was that none of the standard or SARE
 channels update plugins this way.

  From a security point of view you should not enable this by default, by
 doing that you would be leaving a wide open security hole, which could
 get compromised in the future.

 This switch is there for the rare occasion where you decide to allow a
 channel to update a plugin automatically.  This is something you would
 do only after reviewing that channel.

Yep -- I can't see any standard channel needing to use it.  Typically
if someone was to publish a channel that requires a certain custom
plugin, they would indicate that in the channel's documentation...


all clear, now, thanks!

still, would be nice to be able to verify -- using cmd line option --
what, if anything, the channel sa-update DID, in fact, 'send over'.
namely, did/does it install a plugin, in addition to any rules, even
IF disabled ...

thanks.
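
Short of a dedicated option, the update directory itself can be inspected after a run. A sketch, assuming that a channel's plugins would show up as Perl module files or as loadplugin lines in its config files; the helper name is made up, and the path in the usage line should be replaced with your own update directory.

```shell
# Hypothetical helper: show anything under an update directory that could
# load code. Prints Perl module files, then every line mentioning
# "loadplugin" (matched anywhere on the line, so commented-out directives
# are listed too).
list_channel_code() {
    updates_dir="$1"
    find "$updates_dir" -type f -name '*.pm' -print
    grep -rn 'loadplugin' "$updates_dir" 2>/dev/null || true
}

# Usage:
#   list_channel_code /var/mail/spamassassin/updates
```

No output would suggest the channels delivered rules only.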


Re: what does the 'new' --allowupdates option to sa-update do?

2007-02-15 Thread snowcrash+spamassassin

Nope.  Neither include plugins, or other ways to load code, in their
channels.  If they were to in the future I'm sure there'd be some
attempt to make people aware of it.


got it. thanks!


 in the first case, it's clear to trust ... but in the second (SARE)
 case, which channel/author am i actually trusting? DOS, SARE, others?

My involvement in the contents of the channels goes no further than you
trusting me to not have a setup that makes it easy (or even
likely/probable) to compromise the channels and that I'm reproducing the
same data available from the SARE website.  Beyond that I have no
involvement.  I do not audit existing or new ruleset channels (new ones
are created automatically).  Whatever SARE provides is what you get.  So
whatever mechanisms they have in place to ensure you can trust them is
what you're relying on (the same as if you were using RDJ or whatever to
get the rules directly from them).


_that_ is clear. again, thanks.

your 'facts' do provide an example, given the discussion about
'channel trust', and imho, of the lack of documentation/clarity on
determining that trust -- for/by just end-users.  which is, in part,
why, i presume, so many folks suggested (per theo) that the option be
turned OFF by default ...

innocently misunderstanding/enabling 'allowplugins' seems to have the
_potential_ to have some seriously nasty consequences -- i.e.,
exec'ing a plugin w/ root privs! -- if improperly config'd.  a bit
more dire than, say, mis-scoring a rule!

although i still think some sort of proactive check/report of a
channel's activity -- namely, DID it install a plugin ? -- would be a
good idea, given lack of response/interest to the idea, i'll guess
that it's over-(or, silly-) engineering.

then, at least, some additional explanation, clarity,
skulls-n-crossbones, etc added to the manpage/docs/wiki would be
helpful. DOS's comments, above, are a good start, i think ...


Re: what does the 'new' --allowupdates option to sa-update do?

2007-02-15 Thread snowcrash+spamassassin

sa-update -D will tell you anything you want to know, such


there continues to be a belief -- not surprisingly by the 'experts' --
that debug output, rather than user-friendly output, is the answer to
all things.

1st -- and, yes it's my opinion, which i understand doesn't hold much
H2O -- is that there is a big difference between the two feedback
mechanisms.

2nd, in this silly case of 'allowplugins' -- as pointed out TO me
here, a 'mistake' in its application can end up with hostile root
executables in place. me? i think that's a bad thing, worthy of note &
clarity ...

but, no argument, sa-update -D provides the output.


You specifically asked the folks in here for recommendation. This of
course will be subjective...


yes. agreed. no one's arguing ... or criticising.  rather, i'm
acknowledging, and suggesting.


Anyway, an attempt on being not subjective:  Defaults are there for a
reason. Don't change a default, unless you fully understand the impact.


clear.

in this case, the *default* changed.  i didn't change the default. it
used to be ON, now it's OFF -- by default.

trying to fully understand the impact is exactly what i'm trying to do.

despite the seeming belief that, since others find it clear i should
too, iiuc, the purpose of the list _is_ to ask. bottom line, someone
ELSE changed the default, made mention of it in the changelog, didn't
explain it in a way that i understood -- which is my issue, not theirs
-- and so i asked.

now it's been answered.

thoroughly.

i've also added my $0.02 worth of request/suggestion. folks don't
agree -- at least enough to do anything about it.  that's fine.  i'll
have it documented internally for our own purposes.

everybody's happy.

thanks!  :-)


what does the 'new' --allowupdates option to sa-update do?

2007-02-14 Thread snowcrash+spamassassin

i note in 'Changes',

r503835 | felicity | 2007-02-05 19:30:00 + (Mon, 05 Feb 2007) | 1 line
bug 5240: disable plugins by default via sa-update unless new
--allowplugins option is specified


though i read the sa-update manpage, & read the commit here,

 http://www.gossamer-threads.com/lists/spamassassin/commits/92025

and, found nothing on the wiki, i'm unclear.

can someone explain why this is important?  what it does do for me?

as this is a new/recent change, does the addition of this option
TOGGLE any previously default functionality?

what do i need to do/change in order to keep my functionality 'as before'?

fwiw, my current sa-update cron job, that's been working fine until now, is,

 sudo -u spamassassin /usr/local/spamassassin/bin/sa-update \
 --channelfile /var/spamassassin/sa-update-channels.conf \
 --gpghomedir /var/security/gpg-homedir \
 > /dev/null

do i need to change it to not 'lose' any capability?

thanks.


Re: what does the 'new' --allowupdates option to sa-update do?

2007-02-14 Thread snowcrash+spamassassin

The man page is pretty straightforward IMO.


sigh.

ok.

as it's clear to one of the developers (!), it _must_ just be me, then. ;-)


 do i need to change it to not 'lose' any capability?

it depends on the channels you were using.  it doesn't change anything
for the official SA channel.  YMMV for third-party channels.  imo,
don't worry about it right now.

snip

Hope this clarifies some more. :)


yes, it does clarify the what?, nicely. thanks!

now, the for which?  is there a wiki page, or some commentary here on
list (yet?), from others/all as to which/what to 'trust' -- or more
importantly, *not* trust?

given that SA's scoring is all about building trust, and, at least at
the beginning, accepting the community's recommendations for default
scoring/trust, i'm curious, then, as to recommendations _here_.

e.g., _i_ currently run cron jobs that regularly exec,

sa-update --channelfile .../DIST-channels.conf
sa-update --channelfile .../SARE-channels.conf

where,

cat .../DIST-channels.conf
updates.spamassassin.org

and

cat .../SARE-channels.conf
70_sare_obfu.cf.sare.sa-update.dostech.net
72_sare_redirect_post3.0.0.cf.sare.sa-update.dostech.net
70_sare_evilnum0.cf.sare.sa-update.dostech.net
70_sare_evilnum1.cf.sare.sa-update.dostech.net
70_sare_bayes_poison_nxm.cf.sare.sa-update.dostech.net
70_sare_header.cf.sare.sa-update.dostech.net
70_sare_header_eng.cf.sare.sa-update.dostech.net
99_sare_fraud_post25x.cf.sare.sa-update.dostech.net
70_sare_spoof.cf.sare.sa-update.dostech.net
70_sare_random.cf.sare.sa-update.dostech.net
70_sc_top200.cf.sare.sa-update.dostech.net
70_sare_oem.cf.sare.sa-update.dostech.net
70_sare_unsub.cf.sare.sa-update.dostech.net
70_sare_uri.cf.sare.sa-update.dostech.net
70_sare_specific.cf.sare.sa-update.dostech.net
70_sare_oem.cf.sare.sa-update.dostech.net
70_sare_html.cf.sare.sa-update.dostech.net
70_sare_genlsubj.cf.sare.sa-update.dostech.net
70_sare_adult.cf.sare.sa-update.dostech.net
72_sare_bml_post25x.cf.sare.sa-update.dostech.net
70_sare_stocks.cf.sare.sa-update.dostech.net
99_FVGT_Tripwire.cf.sare.sa-update.dostech.net
bogus-virus-warnings.cf.sare.sa-update.dostech.net

since i certainly trust the project, and DOS' contributions, should i
simply mod my cron jobs to,

sa-update --allowplugins --channelfile .../DIST-channels.conf
sa-update --allowplugins --channelfile .../SARE-channels.conf

?

in the first case, it's clear to trust ... but in the second (SARE)
case, which channel/author am i actually trusting? DOS, SARE, others?

what do folks here recommend?

thanks.


Re: what does the 'new' --allowupdates option to sa-update do?

2007-02-14 Thread snowcrash+spamassassin

hi,


I would say you should add allowplugins if and only if the following
three conditions hold:


this is a helpful -- but very subjective -- approach.


 1) You trust the channel provider is not malicious


well, as in the case of the Project itself, and DOS, y'all _are_ 'nice
folks', 'n all.  but, beyond that? how to know ... ... ?


 2) You trust that the channel is not going to be compromised by an
outside agent (the GPG check is supposed to prevent that, but it's
always possible to compromise a GPG key)


well, if compromise is possible, then it's always possible ... and,
per your arguments, that trust is never valid.  well, there _are_
shades of trust ...


 3) The channel is known to distribute plugins, and you want to use
these plugins by default without checking them first


is there a check -- with sa-update itself, or other -- to determine
what, if any, plugins are going to be distributed by/at a channel
subscription?

sure, one can subscribe, then dig around in the distro files, but,
imho, that's not the user-friendliest approach.

would be nice to have a check, e.g., sa-update --channelfile 'blah'
--check-plugins, or something like ...

as, honestly, as i write here, without 1st checking, or perhaps
thinking on it a bit, i could NOT tell you whether or not the SARE
channels(s) i'm sa-update'ing do, or do not, install/include plugins.

i just don't know.


Anyways, that's my opinion, though I'm not nearly as familiar with the
update process as Theo is.


appreciated.

again, it's pretty clear that there _are_ options/choices to be
had/made, but the 'i'm just a user, so what do i do now?' sort of
guidance is still -- as already pointed out, apparently just for me ;-)
-- a bit soft.

thanks.


Re: SA-gen'd message report headers appear differently (with/without linebreaks) in different mail clients

2007-02-06 Thread snowcrash+spamassassin

 bottom line -- SA works perfectly; tbird's display of SA headers is shoddy.

Actually: If SA's header does not have encoded newlines in it,
SAs header is shoddy (or, more likely, SA's header is formatted
to look nice when viewing the message source) and TB (as well as
other mail readers) displays it correctly.


that's the point ... 'other' clients display it 'properly'.

_only_ tbird does not.


If you want the behaviour of Thunderbird to change, it is
possible that a feature request to have the option to display
headers the way you want might be more fruitful than a request to
fix a non-existent bug. :-)


though i'd like tbird to behave like other clients ... and format
those long-line headers correctly/neatly/as-intended, i've spent too
much time already trying to convince anyone it's an issue, and keep
getting 'debates'.

whatever.  currently, it's unformatted in tbird, unlike in other
clients.  intended, or not; bug, or not -- that's a fact.

it looks shoddy compared to other clients. that's just an opinion ...


But first, check if there's any hidden preference that already
does what you want. After all, the Mozilla apps are full of
preferences you can only add or change through the Config Editor.


already did. myself, and questioning others who know much more abt the
internals.

short answer (so far) -- there's no such option.

thanks.


Re: SA-gen'd message report headers appear differently (with/without linebreaks) in different mail clients

2007-02-05 Thread snowcrash+spamassassin

 From your screen shot, I'm guessing you're looking at it via
View-Headers-All.


actually, in any/all header 'views' ...


You can see the original formatting (even in
Thunderbird 2) using the Message Source function instead.


yup, aware of that.  that's not the issue though ... rather, it's the
'mis'-display of the header in the header-views.  bug's been reported
to mozilla; 'discussion' there about is it a bug, even though every
OTHER client handles it correctly ... etc etc

honestly, if 'they' don't want to fix it -- or even acknowledge it --
there are alternatives (like u say, MessageSource, or better yet, use a
different mua/client).

bottom line -- SA works perfectly; tbird's display of SA headers is shoddy.

oh well.

thanks.


Re: SA-gen'd message report headers appear differently (with/without linebreaks) in different mail clients

2007-02-04 Thread snowcrash+spamassassin

Is that the OS X version?


yes, it is.


Plus what version of t-bird are you using?


Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.8.1.2pre)
Gecko/20070203 BonEcho/2.0.0.2pre


I use the linux version and mine has a lot more under that option than
yours is showing. Go figure.


:-/


Re: SA-gen'd message report headers appear differently (with/without linebreaks) in different mail clients

2007-02-04 Thread snowcrash+spamassassin

version 1.5.0.9 (20070104) is what I use. I do build it from source but
that shouldn't make any difference.


i've never successfully managed a build of anything-mozilla.  not that
it's a priority ...


I try to avoid pre-releases for
things such as T-Bird/Firefox and am not sure you could actually revert
back to the 1.5.0.9 stable version.


FF2 & Tbird2 have been, generally, far more stable for me than their
15x counterparts ... but, yes, it's a beta.


Looks like there are some options gone missing in the 2.x branch. ::sigh::


alas, yes.


SA-gen'd message report headers appear differently (with/without linebreaks) in different mail clients

2007-02-03 Thread snowcrash+spamassassin

hi,

when i receive a message that's passed through SpamAssassin,  if i
view the Message Source in any client, i see a correctly/expected
formatted report, e.g:

--
X-Spam-Checker-Version: SpamAssassin 3.1.8-r454679 (2006-10-10)
X-Spam-Level: !
X-Spam-Status: score=17.5/4.0 autolearn=spam
X-Spam-Report:
  *  1.1 EXTRA_MPART_TYPE Header has extraneous Content-type:...type= entry
  *  0.0 DK_POLICY_SIGNSOME Domain Keys: policy says domain signs
some mails
  *  5.0 BOTNET Relay might be a spambot or virusbot
  *  
[botnet0.7,ip=208.103.1.19,hostname=208.103.0.19.etczone.com,baddns,client,ipinhostname]
  *  0.1 TW_CX BODY: Odd Letter Triples with CX
  *  0.1 TW_GW BODY: Odd Letter Triples with GW
  *  0.1 TW_MK BODY: Odd Letter Triples with MK
  *  0.1 TW_BJ BODY: Odd Letter Triples with BJ
  *  0.1 TW_JM BODY: Odd Letter Triples with JM
  *  0.1 TW_UW BODY: Odd Letter Triples with UW
  *  0.1 TW_PW BODY: Odd Letter Triples with PW
  *  0.1 TW_IU BODY: Odd Letter Triples with IU
  *  0.1 TW_YJ BODY: Odd Letter Triples with YJ
  *  0.1 TW_DB BODY: Odd Letter Triples with DB
  *  0.0 HTML_MESSAGE BODY: HTML included in message
  *  3.1 HTML_IMAGE_ONLY_08 BODY: HTML: images with 400-800 bytes of words
  *  1.5 BAYES_50 BODY: Bayesian spam probability is 40 to 60%
  *  [score: 0.5531]
  *  6.0 FUZZY_OCR BODY: Img with common spam text inside
  *  Words found:
  *  cialis in 1 lines
  *  viagra in 1 lines
  *  cialis in 1 lines
  *  viagra in 1 lines
  *  (4 word occurrences found)
--

if i open the message in, e.g. Mulberry, and view 'all' headers, i see a
similarly formatted:

--
X-Spam-Level: !
X-Spam-Status: score=17.5/4.0 autolearn=spam
X-Spam-Report:
  *  1.1 EXTRA_MPART_TYPE Header has extraneous Content-type:...type= entry
  *  0.0 DK_POLICY_SIGNSOME Domain Keys: policy says domain signs
some mails
  *  5.0 BOTNET Relay might be a spambot or virusbot
  *  
[botnet0.7,ip=208.103.1.19,hostname=208.103.0.19.etczone.com,baddns,client,ipinhostname]
  *  0.1 TW_CX BODY: Odd Letter Triples with CX
  *  0.1 TW_GW BODY: Odd Letter Triples with GW
  *  0.1 TW_MK BODY: Odd Letter Triples with MK
  *  0.1 TW_BJ BODY: Odd Letter Triples with BJ
  *  0.1 TW_JM BODY: Odd Letter Triples with JM
  *  0.1 TW_UW BODY: Odd Letter Triples with UW
  *  0.1 TW_PW BODY: Odd Letter Triples with PW
  *  0.1 TW_IU BODY: Odd Letter Triples with IU
  *  0.1 TW_YJ BODY: Odd Letter Triples with YJ
  *  0.1 TW_DB BODY: Odd Letter Triples with DB
  *  0.0 HTML_MESSAGE BODY: HTML included in message
  *  3.1 HTML_IMAGE_ONLY_08 BODY: HTML: images with 400-800 bytes of words
  *  1.5 BAYES_50 BODY: Bayesian spam probability is 40 to 60%
  *  [score: 0.5531]
  *  6.0 FUZZY_OCR BODY: Img with common spam text inside
  *  Words found:
  *  cialis in 1 lines
  *  viagra in 1 lines
  *  cialis in 1 lines
  *  viagra in 1 lines
  *  (4 word occurrences found)
--

BUT, if i open the message in Thunderbird2, the line-breaks in the
header are apparently stripped off; here's what it looks like.

  http://img100.imageshack.us/img100/278/mnenhyallheaderswh1.jpg

In troubleshooting this, i was informed about the Mozilla MailNews
backend, that TBird is using,

 As per RfC (2)822, header _values_ are always just *one* line.
 To get around the (server) restriction of 998 usable characters per
 line, it is allowed to split the value into multiple lines. But these
 line breaks are *not* part of the actual value and recipients have to
 remove the line breaks when decoding the message to get back the real
 value. If the value should contain line breaks, these have to be
 encoded before, eg. as =0A in the Quoted Printable encoding.

 The X-Spam-Result header value is not encoded, thus the line breaks used
 as a formatting in the source are *not* part of the value and *must* be
 stripped before passing the value to the frontend.

 The MailNews backend handling is correct.
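
The unfolding rule quoted above can be sketched in a few lines of Python. This is only an illustration of the RFC 2822 rule as described (a line break followed by a space or tab is folding, not content); `unfold_header` and the sample value are made-up names for this sketch, not anything from SpamAssassin or Thunderbird.

```python
import re

def unfold_header(raw_value: str) -> str:
    """Remove every line break that is immediately followed by a space or tab.

    The whitespace itself is kept, so the folded lines join into one line.
    """
    return re.sub(r"\r?\n(?=[ \t])", "", raw_value)

# Hypothetical folded header, shaped like the X-Spam-Report lines above.
folded = (
    "X-Spam-Report:\r\n"
    " *  5.0 BOTNET Relay might be a spambot or virusbot\r\n"
    " *  0.1 TW_CX BODY: Odd Letter Triples with CX"
)

print(unfold_header(folded))
```

A client that unfolds before display shows the report as one long line, which matches the Thunderbird screenshot; a client that shows the raw folded lines keeps the per-rule layout.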

Since this is the same message, retrieved from the same mail server,
and, therefore, having been processed by the same instance of SA, i'm
guessing this has to do with what the SA report-generating step does.
But, i'm not certain of that ...

That said, can someone chime in here, and perhaps suggest where to
look / what to do about this?

thanks.


Re: SA-gen'd message report headers appear differently (with/without linebreaks) in different mail clients

2007-02-03 Thread snowcrash+spamassassin

There is nothing SpamAssassin related here.  The information in the header is
written w/ whitespace folding.  Most MUAs leave it alone when showing it to
you, Thunderbird apparently unfolds the lines.

You may have an option which lets you disable it, but it's 100% a mail client
issue.


ok.

so far, googling, i haven't even found mention of the issue/problem,
let alone a fix/option to turn it off. :-/

if anyone _here_ sees the problem in tbird, i'd appreciate hearing about it.

thanks.


Re: SA-gen'd message report headers appear differently (with/without linebreaks) in different mail clients

2007-02-03 Thread snowcrash+spamassassin

In T-Bird under preferences-Display under the Formatting tab. wrap
text to fit window width. I believe it is checked by default.


hm. don't have one of those,

 http://img401.imageshack.us/img401/8435/tbirdtabuj4.png

haven't found it elsewhere either (yet ...) :-/


spamhaus' PBL is now *active* (in beta ... but still active). now what?

2007-01-06 Thread snowcrash+spamassassin

reading at the spamhaus site abt PBL i note,

WARNING! Some post-delivery filters use full Received line
traversal or deep parsing, where the filter reads all the IPs in
the Received lines. Legitimate users, correctly sending good mail out
through their ISP's smarthost, will have PBL-listed IPs show up in the
first (lowest) Received header where their ISP picks it up. Such mail
should not be blocked! So, you should tell your filters to stop
comparing IPs against PBL at the IP which hands off to your mail
server! That last hand-off IP is the one which PBL is designed to
check. If you cannot configure your filters that way, then do not use
PBL to filter your mail.
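
A rough sketch of the "only check the hand-off IP" idea from the warning above, in Python. It assumes the relay IPs have already been pulled out of the Received headers, newest hop first, and that the list of internal networks is known; `last_handoff_ip` and the sample addresses are hypothetical and this is not how SpamAssassin itself is implemented.

```python
import ipaddress

def last_handoff_ip(relay_ips, internal_networks):
    """Return the first relay IP (newest hop first) outside our own networks.

    That is the IP that handed the message to our mail server, and the only
    one that should be compared against PBL. Returns None if every hop is
    internal.
    """
    nets = [ipaddress.ip_network(n) for n in internal_networks]
    for ip in relay_ips:
        addr = ipaddress.ip_address(ip)
        if not any(addr in net for net in nets):
            return ip
    return None

# Hypothetical hops: our own MX, the sender's ISP smarthost, the sender's PC.
relays = ["192.168.1.10", "203.0.113.25", "198.51.100.7"]
internal = ["192.168.0.0/16", "127.0.0.0/8"]

print(last_handoff_ip(relays, internal))
```

Here the end user's dynamic IP (the last entry) is never looked up, which is the behaviour the Spamhaus warning asks for.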

with the ever-smarter filters available with SA & SARE etc, what -- if
anything -- should 'we' do/configure differently in SA's confs/ops to
avoid this issue?

thanks.


Re: spamhaus' PBL is now *active* (in beta ... but still active). now what?

2007-01-06 Thread snowcrash+spamassassin

wow dude, that's quick -- I hear it went live only a few hours
ago ;)


i've waited long with bated breath for
[EMAIL PROTECTED] et. al. to leave me the fsck
alone :-)


As long as trusted_networks and internal_networks are configured
correctly


correctly ?!

oh heck ... here we go again! ;-)


Re: spamhaus' PBL is now *active* (in beta ... but still active). now what?

2007-01-06 Thread snowcrash+spamassassin

That would be the case if the PBL rule looked like:

  header RCVD_IN_PBL  eval:check_rbl('zen', 'zen.spamhaus.org.', '127.0.0.1[01]')

instead of

  header RCVD_IN_PBL  eval:check_rbl('zen-notfirsthop', 'zen.spamhaus.org.', '127.0.0.1[01]')


grep'ing in my dist files & rules, there's no trace of 'zen' (well,
except for the polish and dutch cf's ... those folks need more vowels!
:-) ) or 'notfirsthop', so, to my first question ...

is there something we normal, non-sa-godlike humans need to do to
distro (sa, sare, etc) files? or *just* make said mod in our local.cf
rules?


thanks.


Re: spamhaus' PBL is now *active* (in beta ... but still active). now what?

2007-01-06 Thread snowcrash+spamassassin

run sa-update.


i regularly run updates via cron on the hour.

running it again, or at all, will change what/where?

again, i see no traces of zen/pbl anywhere other than in my local.cf, atm.

i'm asking what *specifically* needs to change, if anything, in SA ...
i'd prefer NOT to be blind about it.

thanks.


Re: spamhaus' PBL is now *active* (in beta ... but still active). now what?

2007-01-06 Thread snowcrash+spamassassin

Specifically, nothing.  The updates already include it:

updates_spamassassin_org/20_dnsbl_tests.cf:header __RCVD_IN_ZEN eval:check_rbl('zen', 'zen.spamhaus.org.')
updates_spamassassin_org/20_dnsbl_tests.cf:header RCVD_IN_XBL eval:check_rbl('zen-lastexternal', 'zen.spamhaus.org.', '127.0.0.[456]')
updates_spamassassin_org/20_dnsbl_tests.cf:header RCVD_IN_PBL eval:check_rbl('zen-lastexternal', 'zen.spamhaus.org.', '127.0.0.1[01]')
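
For anyone wondering what those rule arguments amount to, here is a small Python sketch of a DNSBL lookup name and of the '127.0.0.1[01]' sub-test from the RCVD_IN_PBL rule. It assumes the usual DNSBL convention (reversed IPv4 octets prepended to the zone) and an anchored match on the returned address; the function names and the sample IP are invented for the example, and no DNS query is actually made.

```python
import ipaddress
import re

def dnsbl_query_name(ip, zone="zen.spamhaus.org"):
    """Build the DNS name to look up for an IPv4 address in a DNSBL zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone

# The sub-test pattern from the RCVD_IN_PBL rule, anchored for this sketch.
PBL_ANSWER = re.compile(r"^127\.0\.0\.1[01]$")

def is_pbl_answer(answer):
    """True if a returned A record falls in the PBL range of the rule."""
    return bool(PBL_ANSWER.match(answer))

print(dnsbl_query_name("203.0.113.25"))
print(is_pbl_answer("127.0.0.10"), is_pbl_answer("127.0.0.4"))
```

The same zone answers for XBL with 127.0.0.[456], which is why one ZEN lookup can feed several rules.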


and, that's it,

% grep PBL Dist/* | grep RCVD
%
% grep ZEN Updates/3.001008/updates_spamassassin_org/* | grep RCVD
Updates/3.001008/updates_spamassassin_org/20_dnsbl_tests.cf:header __RCVD_IN_ZEN eval:check_rbl('zen', 'zen.spamhaus.org.')
Updates/3.001008/updates_spamassassin_org/20_dnsbl_tests.cf:describe __RCVD_IN_ZEN  Received via a relay in Spamhaus ZEN
Updates/3.001008/updates_spamassassin_org/20_dnsbl_tests.cf:tflags __RCVD_IN_ZEN net
% grep PBL Updates/3.001008/updates_spamassassin_org/* | grep RCVD
Updates/3.001008/updates_spamassassin_org/20_dnsbl_tests.cf:header RCVD_IN_PBL  eval:check_rbl('zen-lastexternal', 'zen.spamhaus.org.', '127.0.0.1[01]')
Updates/3.001008/updates_spamassassin_org/20_dnsbl_tests.cf:describe RCVD_IN_PBL Received via a relay in Spamhaus PBL
Updates/3.001008/updates_spamassassin_org/20_dnsbl_tests.cf:tflags RCVD_IN_PBL  net
Updates/3.001008/updates_spamassassin_org/20_dnsbl_tests.cf:#reuse RCVD_IN_PBL
Updates/3.001008/updates_spamassassin_org/30_text_de.cf:lang de describe RCVD_IN_PBL Transportiert via Rechner in PBL-Liste (http://www.spamhaus.org/pbl/)
Updates/3.001008/updates_spamassassin_org/30_text_nl.cf:lang nl describe RCVD_IN_PBL Ontvangen via een relay die gevonden is in Spamhaus PBL
Updates/3.001008/updates_spamassassin_org/50_scores.cf:score RCVD_IN_PBL 0 0.001 0 0.001

now, given John Rudd's comment of

 ah, I didnt' know about notfirsthop.  That addresses it completely.

i still see no instance,

% grep -rlni notfirsthop Updates/
%

is notfirsthop *necessary*, or just the _right_way_ for that specific example?

thanks much.


Re: spamhaus' PBL is now *active* (in beta ... but still active). now what?

2007-01-06 Thread snowcrash+spamassassin

In any case, why the fuss?  You've had three SA developers tell you the
rules that are published are fine how they are.


wow.

what fuss ? i've been polite in my intent and in my asking.  this
*is* the users list after all.

i'm asking questions so that i understand. contrary to what you may
believe, i actually read those comments several times and still wasn't
clear.  lastexternal had *not* been mentioned, notfirsthop *had*.

it may come as a surprise to you, but just cause you 'say' it, doesn't
mean that we all understand it.

i'm sorry if you find me too dense for your standards; i'll make sure
not to bother _you_ further.


Re: spamhaus' PBL is now *active* (in beta ... but still active). now what?

2007-01-06 Thread snowcrash+spamassassin

The recent 3.1 updates include the ZEN rules.  If you're asking what files are
changed by sa-update, please see man sa-update and the other documentation
referenced therein.


no, i was asking what files need to be changed in order for the
referenced 'warning' abt PBL usages w/ filtering/scanning apps -- such
as SA -- to NOT be a problem.

as in the following post by Phil, i had read this, and the exchange
bet jw & john, as indicating that we, at least, need to ensure -- and
possibly change something -- that something's done a particular way.


Nothing needs to be changed, the update has everything necessary.


again, that was NOT clear.  it was made MORE unclear by their exchange.

anyway, from my perspective, i've now followed advice, simply run
sa-update, ensured it --lint'ed correctly as usual, and expect that
this will _not_ be a problem.

thanks.

p.s. ($1 says that there _will_ be others that ask, again, in the future ...)


sa + bayes/sqlite _performance_? reasons _not_ to use it?

2006-12-31 Thread snowcrash+spamassassin

i'm interested in using sqlite across my 'entire' mail server env.
currently, exim+dovecot+spamassassin.

i know sqlite _can_ be used for bayes db in sa.  lots of info on that.

any reasons it should NOT be used?

i'm guessing performance, compared to dbm, might be an issue, but
other than a comment in sql/README.bayes:

NOTE: You may
find that some implementations do not provide a significant advantage
over using the default DBM implementation.

i have not found a performance comparison -- QUESTIONS about it, yes.
but no ANSWERS (yet).

any references, info, comments?

thanks & happy new year.


Re: sa + bayes/sqlite _performance_? reasons _not_ to use it?

2006-12-31 Thread snowcrash+spamassassin

There are no published performance numbers for using SQLite because it
is so slow I gave up the tests, deciding it was not even worth the
effort.  When I say slow, I mean 15+ hrs to do what even the basic SQL
storage module on MySQL could do in < 5 mins.


15+ hours vs 5 minutes ??!!

i don't know the details of what the 'what' is, but that's astounding.



This is most likely because a custom storage module for SQLite is
needed, some have pointed this out.

Probably not the answer you wanted, but thats about all there is.


if it's a factual answer that's saved me time/effort, i'm very
appreciative!  thanks.

i think, then, i'll stick to dbm for spamassassin, and
consider/evaluate use of sqlite for simple/short lookups in
dovecot+exim. and/or wonder why bother ...


Re: --lint test fails

2006-12-29 Thread snowcrash+spamassassin

In running a lint test on one of my boxes I get the following error which I 
can't seem
to figure out why. Pyzor is installed and the path is correct:

[3075] warn: config: failed to parse line, skipping: pyzor_add_header 1
[3075] warn: lint: 1 issues detected, please rerun with debug enabled for more
information


assuming you're running a recent 31x ver of SA, that cmd is no longer
the way to enable pyzor ...

rather, this

   loadplugin Mail::SpamAssassin::Plugin::Pyzor

is added to init.pre.


Re: sa-learn explained

2006-12-29 Thread snowcrash+spamassassin

and this,

http://www.spamhaus.org/zen

Caution: zen.spamhaus.org replaces sbl-xbl.spamhaus.org.

If you are currently using sbl-xbl.spamhaus.org you can now replace
'sbl-xbl' with 'zen' (sbl-xbl.spamhaus.org will eventually become
obsolete and may in the future be withdrawn from service).

zen.spamhaus.org should now be the only spamhaus.org DNSBL in your
configuration. You should not use ZEN together with other Spamhaus
blocklists or you will simply be wasting DNS queries and slowing your
mail queue.


Re: sa-learn explained

2006-12-29 Thread snowcrash+spamassassin

Perhaps it's not ready for prime time. I can't imagine that if it was they
would not be making it headline news.


linford has, apparently, stated in posts to newsgroups that folks
should switch _now_. i think there's a reference in this list's
archive, iirc.

public announcements, i'd guess, will be made when all t's are crossed etc etc


Re: Error in FuzzyOcr 3.5.x branch

2006-12-27 Thread snowcrash+spamassassin

hi,

some of this is familiar ...


I use FuzzyOcr 3.5.x branch.


are you, in fact using the SVN branch? or building from the 'release' tarballs?

my suggestion is stick to the tarballs, for now.

and, if you are using the tarballs, have you applied all 3 patches?

iirc, the untie errors i'd seen were dealt with in patch #1.

just guessing ...


Re: Error in FuzzyOcr 3.5.x branch

2006-12-27 Thread snowcrash+spamassassin

He meant that you should use:

  http://fuzzyocr.own-hero.net/wiki/Downloads


yes, that's correct.

to be clear, download,

 http://users.own-hero.net/~decoder/fuzzyocr/fuzzyocr-3.5.0-rc1.tar.gz
 http://users.own-hero.net/~decoder/fuzzyocr/patchset1.patch
 http://users.own-hero.net/~decoder/fuzzyocr/patchset2.patch
 http://users.own-hero.net/~decoder/fuzzyocr/patchset3.patch

and, per instructions on the DL page, apply,

 patch -p0 < .../patchset1.patch
 patch -p0 < .../patchset2.patch
 patch -p0 < .../patchset3.patch


and leave the code in development alone (unless you are a developer ;^).


well, i didn't actually suggest THAT, but it's probly good advice ;-)


--lint reports failed to parse line, skipping: _ _2_ _R_TEXT_; can't find the problem

2006-12-21 Thread snowcrash+spamassassin

i have installed,

 spamassassin --version
SpamAssassin version 3.1.8-r454679
  running on Perl version 5.8.8

after a recent sa-update, --lint returns,

[7718] dbg: plugin: fixed relative path:
/etc/mail/spamassassin/updates/3.001008/updates_spamassassin_org/80_additional.cf
[7718] dbg: config: using
/etc/mail/spamassassin/updates/3.001008/updates_spamassassin_org/80_additional.cf
for included file
[7718] dbg: config: read file
/etc/mail/spamassassin/updates/3.001008/updates_spamassassin_org/80_additional.cf
[7718] warn: config: failed to parse line, skipping: _ _2_ _R_TEXT_

but,

 cd /etc/mail/spamassassin/updates/3.001008/updates_spamassassin_org
 ls 80_additional.cf
80_additional.cf
 grep -i R_TEXT 80_additional.cf


and, even,

 cd /etc/mail/spamassassin
 grep -rlni R_TEXT .


suggestions as to what/where the issue may be?

thanks.


Re: --lint reports failed to parse line, skipping: _ _2_ _R_TEXT_; can't find the problem

2006-12-21 Thread snowcrash+spamassassin

   [7718] warn: config: failed to parse line, skipping: _ _2_ _R_TEXT_

No idea what that's about.  The underscores could be other random/non-print
chars btw.


it turns out that there was a hidden file in the same directory as my
local.cf named,

._local.cf

filled with garbage, which the --lint was loading/reading.

deleting it makes the --lint failure go away.

i have NO idea where that came from.  _i_ most certainly did not create it.

i never would have thought to look for a _hidden_ *.cf file; i
stumbled across it.

could we, perhaps, have an option (at ./configure time, maybe?) to
TURN OFF loading of hidden .cf's?
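
Until something like that exists, a quick check for stray hidden rule files is easy to script. A minimal Python sketch, assuming only that the problem files are dot-files ending in .cf in the config directory (as with the ._local.cf above); `hidden_cf_files` is a made-up helper name and the demo runs against a throwaway temp directory rather than a real /etc/mail/spamassassin.

```python
import tempfile
from pathlib import Path

def hidden_cf_files(config_dir):
    """List the names of hidden (dot-prefixed) *.cf files in config_dir."""
    return sorted(
        p.name
        for p in Path(config_dir).iterdir()
        if p.is_file() and p.name.startswith(".") and p.name.endswith(".cf")
    )

# Demo against a temporary directory with one normal and one hidden .cf file.
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ("local.cf", "._local.cf", ".notes.txt"):
        Path(demo_dir, name).write_text("# placeholder\n")
    found = hidden_cf_files(demo_dir)

print(found)
```

Pointing it at the real config directory before running --lint would have surfaced the ._local.cf without any stumbling.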

thanks.


Re: SpamdForkScaling messages?

2006-12-14 Thread snowcrash+spamassassin

They're debug messages -- not a problem at all.


great. i can ignore them. :-)

does it matter at all that those message have DISappeared after
switching from sa-via-TCP-sock to sa-via-UNIX-sock?


SpamdForkScaling messages?

2006-12-13 Thread snowcrash+spamassassin

i have

spamassassin --version
SpamAssassin version 3.1.8-r454679
  running on Perl version 5.8.8

in my debug-level spamd log i see frequently repeating instances of,

Wed Dec 13 18:36:13 2006 [923] dbg: prefork: periodic ping from spamd parent
Wed Dec 13 18:36:13 2006 [923] dbg: prefork: sysread(9) not ready,
wait max 300 secs
Wed Dec 13 18:36:13 2006 [923] dbg: prefork: periodic ping from spamd parent
Wed Dec 13 18:36:13 2006 [923] dbg: prefork: sysread(9) not ready,
wait max 300 secs
...

grep'ing in src, i note that these errors originate in,

SpamdForkScaling.pm

afaict, there's no manpage available for Mail::SpamAssassin::SpamdForkScaling

searching on the website, i find links to the .pm src.

both TITLE & FULLTEXT searches on the wiki come up empty.

what is SpamdForkScaling? are there docs?
are these not ready messages a problem?
if so, what do i do about them?


any TextWrapError follow-up?

2006-12-13 Thread snowcrash+spamassassin

i've come across this issue,

 http://wiki.apache.org/spamassassin/TextWrapError

where it's noted that the bug was reported to the TextWrap author.

is this being followed up on at all by anyone here?

anyone have a bug reference for the issue @ TextWrap?

thanks.


Re: any TextWrapError follow-up?

2006-12-13 Thread snowcrash+spamassassin

I'd say make sure you have something newer than that and try it again.  If you
still have problems, please reopen bug 5052 w/ the Text::Wrap and SA versions.


yup. too old.

i'm co'ing current @ (Revision: 486953) which should do the trick.

thanks.


spamd won't stay dead. (possible follow-up to bug 4304?)

2006-12-13 Thread snowcrash+spamassassin

after launching spamd (31x branch, r486953) with,

spamd --daemonize --nouser-config --allow-tell
--allowed-ips=192.168.1.10,127.0.0.1 --listen-ip=127.0.0.1 --port=783
/dev/null  /var/log/spamd.log 

i see only,

ps -ax | grep -i spamd
  922  ??  S  0:00.18 spamd child
  923  ??  S  0:00.14 spamd child
24006  p1  R+ 0:00.01 grep -i spamd

if i want to stop/restart spamd,

kill 922 923

kills the two child processes, which then immediately restart.

iirc, this,

kill -HUP `ps -ax | grep \? | grep bin/spamd | cut -c1-5`

used to work because i actually saw a kill-able spamd master process.

how do i kill spamd and keep it dead?

this bug,

http://issues.apache.org/SpamAssassin/show_bug.cgi?id=4304

seems related, and something (a fix?) WAS committed to r485842, but
i'm still seeing this problem with my version.

was this commit a fix?
was it to TRUNK or branch 31x?


Re: spamd won't stay dead. (possible follow-up to bug 4304?)

2006-12-13 Thread snowcrash+spamassassin

Hrm.  There's no parent in that output.

Try ps -ef | grep spamd and see what happens.


not sure what you're looking for here, but,

 % ps -ef | grep spamd
   ps: illegal option -- f


 kills the two child processes, which then immediately restart.

Yeah, you need to deal with the parent, not the children.


hm.

if i replace,

--allowed-ips=192.168.1.10,127.0.0.1 --listen-ip=127.0.0.1 --port=783

with,

--socketpath /var/proc/spamd.sock --socketowner spamassassin
--socketgroup spamassassin --socketmode 0664

and then relaunch spamd,

checking, i see, now,

ps -ax | grep -i spamd
24190  ??  Ss 0:11.55 /usr/bin/perl -T -w
/usr/local/spamassassin/bin/spamd ... --allow-tell --daemonize
--min-children
24213  ??  S  0:00.07 spamd child
24214  ??  S  0:00.07 spamd child
24216  p2  S+ 0:00.00 grep -i spamd

so the master is here now.

this 'missing master' is, for me, reproducible when spamd is launched
using TCP sockets.

anyway, now, with spamd launched on UNIX sock,

kill `ps -ax | grep \? | grep bin/spamd | cut -c1-5`

works as expected.

as i use spamd on localhost to my MTA, UNIX sock is just fine.

are there any DISadvantages to launching spamd on a UNIX socket, vs a
TCP socket?


Re: spamd won't stay dead. (possible follow-up to bug 4304?)

2006-12-13 Thread snowcrash+spamassassin

  % ps -ef | grep spamd
ps: illegal option -- f

Hrm.  What platform are you on?  ps -axwg, ps -el ?


well, what day/time is it?

at the moment, MacOSXServer.  during the day, usually an OpenSuSE or
FreeBSD box.


Anyway, you could also look at using a pid file.  Tell spamd when starting -r
/path/to/pidfile, and then you can use that to figure out who the parent is.
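
A hedged sketch of the pid-file approach in Python: read the parent's PID from the file spamd was told to write with -r, and signal that process instead of grepping ps output. The helper names are invented here, SIGTERM as the "stop" signal is an assumption, and the demo only reads a throwaway pid file holding this script's own PID; it never signals anything.

```python
import os
import signal
import tempfile

def read_pid(pidfile):
    """Return the integer PID stored in pidfile."""
    with open(pidfile) as fh:
        return int(fh.read().strip())

def stop_parent(pidfile, sig=signal.SIGTERM):
    """Signal the process named in pidfile (the parent, not the children)."""
    pid = read_pid(pidfile)
    os.kill(pid, sig)
    return pid

# Demo: write a temporary pid file and read it back. stop_parent() is
# deliberately not called here.
with tempfile.NamedTemporaryFile("w", suffix=".pid", delete=False) as fh:
    fh.write(f"{os.getpid()}\n")
demo_pid = read_pid(fh.name)
os.unlink(fh.name)

print(demo_pid)
```

Because the file names the parent directly, this works even when the parent does not show up under an obvious name in ps.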


i had actually been specifying the PID file (just didn't include it in my
post), and had not thought of that.

but still, the master was missing.

anyway, i've switched to unix sockets for now -- good enough.

and, i *see* the master with a simple ps -ax ... which is what i
remember the behavior as being b4.

also, with the unix sockets, the other issue i posted abt, repeating
messages of,

Wed Dec 13 18:36:13 2006 [923] dbg: prefork: periodic ping from spamd parent
Wed Dec 13 18:36:13 2006 [923] dbg: prefork: sysread(9) not ready,
wait max 300 secs

seems to have been resolved as well.


Re: some scores (fuzzyocr, spf, tvd_fw_graphic) missing in normal submission; OK in manual resubmit

2006-12-12 Thread snowcrash+spamassassin

that is hard to tell, can you reproduce the error somehow? (i.e.
reproduce the situation where FuzzyOcr did NOT score?).


well, there lies the challenge -- and the point, i guess -- *i* can't
reproduce the non-scoring.  every test i run scores OK.


If so, enable
debugging to the logfile to see whats going on exactly :)


forgot abt the separate log :-/

i cranked logging verbosity from 1-3; and will keep an eye out for
next non-scoring message.

but, i *did* notice in my level 1 log,

2006-12-12 11:42:01 [3314] gifsicle is already defined, skipping...
2006-12-12 11:42:01 [3314] giffix is already defined, skipping...
2006-12-12 11:42:01 [3314] giftext is already defined, skipping...
2006-12-12 11:42:02 [3314] gifinter is already defined, skipping...
2006-12-12 11:42:02 [3314] giftopnm is already defined, skipping...
2006-12-12 11:42:02 [3314] jpegtopnm is already defined, skipping...
2006-12-12 11:42:02 [3314] pngtopnm is already defined, skipping...
2006-12-12 11:42:02 [3314] bmptopnm is already defined, skipping...
2006-12-12 11:42:02 [3314] tifftopnm is already defined, skipping...
2006-12-12 11:42:02 [3314] ppmhist is already defined, skipping...
2006-12-12 11:42:02 [3314] gocr is already defined, skipping...
2006-12-12 11:42:02 [3314] ocrad is already defined, skipping...
2006-12-12 11:42:02 [3314] pnmnorm is already defined, skipping...
2006-12-12 11:42:02 [3314] pnminvert is already defined, skipping...
2006-12-12 11:42:02 [3314] convert is already defined, skipping...
2006-12-12 11:42:02 [3314] pamthreshold is already defined, skipping...
2006-12-12 11:42:02 [3314] ppmtopgm is already defined, skipping...
2006-12-12 11:42:02 [3314] pamtopnm is already defined, skipping...
2006-12-12 11:42:02 [3314] Error, label already used earlier in line
170, aborting...
2006-12-12 11:42:02 [3314] Error parsing preprocessor file
/etc/mail/spamassasson/FuzzyOcr.preps, aborting...

don't know if this is a problem yet ...


Re: some scores (fuzzyocr, spf, tvd_fw_graphic) missing in normal submission; OK in manual resubmit

2006-12-12 Thread snowcrash+spamassassin

also, if i extract the .gif from the spam, attach to a new message and
mail that to myself, it scores/reports correctly with all -- fuzzyocr
& others -- tests.

hm ...


should no-autolearned, but highly-scored blabby spam be learned?

2006-12-11 Thread snowcrash+spamassassin

i noted in a recent thread a suggestion to not feed bayes-poisoning
spam to sa-learn.

that's an interesting thought; and actually makes some initial sense to me.

is this, in fact, widely suggested/recommended?

e.g., if i have a blabby, bayes-poisoning spam that already scores high,

X-Spam-Status: score=11.5/4.0 autolearn=no
X-Spam-Report:
*  2.0 RELAY_FR Relayed through France
*  1.1 EXTRA_MPART_TYPE Header has extraneous Content-type:...type= 
entry
*  0.0 DK_POLICY_SIGNSOME Domain Keys: policy says domain signs some 
mails
*  0.0 BOTNET_NORDNS IP address has no PTR record
*  0.0 HTML_MESSAGE BODY: HTML included in message
*  1.5 BAYES_50 BODY: Bayesian spam probability is 40 to 60%
*  [score: 0.5000]
*  1.2 SARE_GIF_ATTACH FULL: Email has a inline gif
*  0.7 MY_CID_AND_STYLE SARE cid and style
*  5.0 BOTNET The submitting mail server looks like part of a Botnet

should this be submitted to sa-learn? or simply discarded?

thanks.


just wanting to say thanks!

2006-12-06 Thread snowcrash+spamassassin

i've installed spamassassin 318 branch with 'botnet', 'imageinfo' &
'fuzzyocr' plugins.

i stay regularly updated via sa-update with distro & SARE rules.

i've got a well-trained bayes system.

my servers see ~ 4-5K messages a day; yes, tiny volume by many standards.

i admit to 'cheating' by depending heavily on zen.spamhaus.org DNSBL @
SMTP negotiation, and ruthlessly blocking China/Korea at the routers.

over the last month or so, i've been (finally) managing a proper
quarantine and monitoring stats.

fwiw, in ~130K messages, i've seen,

0 false positives
0 false negatives

that's certainly batting a thousand in my book. and, yes, YMMV and
i'm probly just 'lucky' this month.

regardless, it's long past time to simply say,

THANKS!


Re: Installed FuzzyOCR - What am I missing?

2006-11-28 Thread snowcrash+spamassassin

spamassassin < animated-gif.eml > out

out shows no FuzzyOCR hits.

Am I missing something obvious?


when *i* first ran tests, i'd set:

focr_autodisable_score 10

the score hit 10 too soon ... and fuzzy ocr didn't run/score any hits.

set it 'high', e.g.,

focr_autodisable_score 999

then try again

worked for me.

hth.


Re: blarsbl

2006-11-21 Thread snowcrash+spamassassin

[EMAIL PROTECTED]: host gateway.mchsi.com[204.127.203.150] said:
 550-12.175.23.161 blocked by ldap:ou=rblmx,dc=mso,dc=att,dc=net
550 Blocked
 for abuse. Please contact the administrator of your ISP or sending
 mailservice. (in reply to MAIL FROM command)


aha. the mchsi-variant of att. i seem to keep bumping into these guys
re: questionable emails/policies.

thanks for the info!


Re: blarsbl

2006-11-21 Thread snowcrash+spamassassin

On 11/21/06, Thomas Lindell [EMAIL PROTECTED] wrote:

Att mail servers use his service.


can you please share/point-to some evidence of that fact?  if that
*is* the case, i'll be chatting with my reps at att!

if i've missed it here, i apologize in advance ...


thanks.


Re: I've got TORA.08 spelled with numbers?

2006-11-17 Thread snowcrash+spamassassin

 I'm getting a bunch of spams this morning that have
 TORA.08 spelled out with numbers like this.


lordy, lordy!

i'm just *SURE* i'm missing the whole point of this sort of spam ...

... but WHY do these spammers even bother with this sort of stuff?

even if it *does* temporarily get past filters -- who in their right
mind clicks on this stuff?  or, worse, would send/invest $$$?


FuzzyOcr failing 'png' tests

2006-11-17 Thread snowcrash+spamassassin

(seems like the 'action' is over here ...)

i'm running SA v3.1.8-r454679, with the FuzzyOCR v3.4.2-release

$SA --lint is error-free.

testing the plugin with provided test messages,

$SA -t -x < /tmp/ocr-gif.eml
$SA -t -x < /tmp/ocr-jpg.eml
$SA -t -x < /dev/FuzzyOcr-3.4.2/samples/animated-gif.eml
$SA -t -x < /dev/FuzzyOcr-3.4.2/samples/corrupted-gif.eml
$SA -t -x < /dev/FuzzyOcr-3.4.2/samples/jpeg.eml
$SA -t -x < /dev/FuzzyOcr-3.4.2/samples/ocr-animated.eml

all show hits/scores with FuzzyOCR rules, as expected.

but,

$SA -t -x < /tmp/ocr-png.eml
$SA -t -x < /dev/FuzzyOcr-3.4.2/samples/png.eml

both complete without apparent error, and score numerous other SA-rule hits, but
no FuzzyOCR scores at all.

i have verified that i'm not auto-disabling FuzzyOcr,

 grep focr_autodisable_score FuzzyOcr.cf
   focr_autodisable_score 999

and, since a number of examples seem to be scoring properly, i'm
guessing either FuzzyOcr itself or my config have a problem.

1st question -- can anyone verify success/failure of those png
examples with their own SA+FuzzyOcr setup?

thanks.


Re: Rules Du Jour broken?

2006-11-16 Thread snowcrash+spamassassin

  Actually, the whole exit0.us site doesnt work.

 Its been down for almost 2 weeks. I thought it would come back up,
 but it may be gone for good :(

Then what do we do for rule updates?


my understanding is that all (most?) rules are available by sa-update,
as an alternative/interim solution if you like.


Re: Rules Du Jour broken?

2006-11-16 Thread snowcrash+spamassassin

sa-update isn't included if we're running Debian Sarge on our mail
server.  (SA version 3.0.3)  But thanks.


sorry, didn't realize this wasn't a build from src :-/

(serves me right for not reading the full thread ...)


fyi: spamhaus' SBL-XBL dnsbl being replaced by ZEN

2006-11-15 Thread snowcrash+spamassassin

http://www.spamhaus.org/zen/

steve linford of spamhaus has recommended that people switch now:

 Is there any reason not to change?

None, I advise everyone to change now.

The SBL-XBL zone will continue to exist for some time but will not of
course contain the new PBL DNSBL and will not contain other future
DNSBLs we may release. ZEN is designed to be safely hard-coded into spam
filter appliances and commercial filters.

i presume this will have effects on the SBL- & XBL- related rules here.


Re: fyi: spamhaus' SBL-XBL dnsbl being replaced by ZEN

2006-11-15 Thread snowcrash+spamassassin

 i presume this will have effects on the SBL- & XBL- related rules here.

probably nothing too serious though ;)


just some renaming, i'd guess.


Where did he mention this, as a matter of interest?


in the n.a.n.a.e. loony-bin, of course. :-)

http://groups-beta.google.com/group/news.admin.net-abuse.email/msg/2d050ab220faf931


Re: fyi: spamhaus' SBL-XBL dnsbl being replaced by ZEN

2006-11-15 Thread snowcrash+spamassassin

 in the n.a.n.a.e. loony-bin, of course. :-)

eek, I'm not reading _that_ ;)


:-D

i kept kill-filing so much of nanae in my reader that finally it was
just easier to killfile *, and whitelist Linford.

he pops up there with some useful info every once in awhile :-)


Re: do imageinfo and fuzzyocr plugins' results overlap?

2006-11-13 Thread snowcrash+spamassassin

I use both here.

In FuzzyOcr.cf, set focr_autodisable_score to the threshold you require.

That way it only scans images if the SA score so far is under the
specified threshold.

It's a lot cheaper to bump up the score using ImageInfo than to do a
couple of OCR scans.
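
The gating described above boils down to one comparison. A tiny Python sketch of that logic as explained in this thread (scan only while the score so far is under the threshold); it is not FuzzyOcr's actual code, and `should_run_ocr` is a hypothetical name.

```python
def should_run_ocr(score_so_far, autodisable_score):
    """Scan images only if the score from cheaper rules is under the limit."""
    return score_so_far < autodisable_score

# With a low limit, an already high-scoring message skips the OCR pass ...
print(should_run_ocr(12.0, 10))
# ... while a very high limit (e.g. 999 for testing) effectively never skips.
print(should_run_ocr(12.0, 999))
```

This is also why a test message that piles up points from other rules can show no FuzzyOcr hits at all when the limit is set low.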


ok, that does make sense, thanks.  i've also just recognized the
'priority' setting in FuzzyOcr that allows me to ensure that other
plugins 'run' before it.

just curious -- what do you typically set that threshold at? assuming
a 'standard' is_spam threshold of, say, 5.0.


Re: do imageinfo and fuzzyocr plugins' results overlap?

2006-11-13 Thread snowcrash+spamassassin

We Use MailScanner which has concepts of low- and high- scoring
spam. I set focr_autodisable_score to just above my high spam score.

If it's already scored high enough for it to not reach the user's
mailbox, there's no need for FuzzyOcr to do anything.


clear.

thanks!


what are default rule priorities?

2006-11-13 Thread snowcrash+spamassassin

i understand that the fuzzyocr plugin can be set to have a high (900?)
priority, so as to run last.

i assume this priority is a threshold number relative to other rules'
priorities.

but, what ARE the other rules' priorities?

is there documentation of that? nothing on the wiki that i've found.


fuzzyocr 342 fires error warn, but scores anyway ... does it work?

2006-11-13 Thread snowcrash+spamassassin

i've installed fuzzyocr 3.4.2.

using a sample-file from the trac site,

spamassassin -t -x < ocr-gif.eml

i get an error & a warning:

GIF-LIB error: Failed to Read from given file.
[13690] warn: MLDBM error: Second level tie failed, No such file or
directory at /etc/mail/spamassassin/FuzzyOcr.pm line 455
...
Return-Path: [EMAIL PROTECTED]
X-Spam-Flag: YES
X-Spam-Checker-Version: SpamAssassin 3.1.8-r454679
...


but, the message does score:

Content analysis details:   (8.4 points, 4.0 required)

 pts rule name              description
---- ---------------------- --------------------------------------------------
 2.0 RELAY_TW               Relayed through Taiwan
 0.0 RELAY_DE               Relayed through Germany
 0.0 DK_POLICY_SIGNSOME     Domain Keys: policy says domain signs some mails
 0.0 HTML_MESSAGE           BODY: HTML included in message
 1.7 RCVD_IN_NJABL_DUL      RBL: NJABL: dialup sender did non-local SMTP
                            [217.226.209.237 listed in combined.njabl.org]
 0.9 MY_CID_AND_CLOSING     SARE cid and closing
 1.5 FUZZY_OCR_WRONG_CTYPE  BODY: Mail contains an image with wrong
                            content-type set
                            Image has format GIF but content-type is
                            image/jpeg
 2.5 FUZZY_OCR_CORRUPT_IMG  BODY: Mail contains a corrupted image
                            Corrupt image: GIF-LIB error: Image is
                            defective, decoding aborted.
-0.2 AWL                    AWL: From: address is in the auto white-list

so, given the error+warn, did fuzzyocr work as it should here, or not?
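
one quick way to see which fuzzyocr rules actually hit is to filter the
report for them.  a minimal sketch, assuming the report layout shown
above; fuzzyocr_hits is a hypothetical helper name, not something
shipped with SpamAssassin or FuzzyOcr.  a hit only shows that the
plugin's rules were evaluated, not that an OCR pass itself ran.

```shell
#!/bin/sh
# Hypothetical helper: read a "Content analysis details" report on stdin
# and print the score and name of every rule whose name starts with
# FUZZY_OCR.
fuzzyocr_hits() {
    awk '$2 ~ /^FUZZY_OCR/ { print $1, $2 }'
}

# Sample lines taken from the report in this post.
fuzzyocr_hits <<'EOF'
 2.0 RELAY_TW               Relayed through Taiwan
 1.5 FUZZY_OCR_WRONG_CTYPE  BODY: Mail contains an image with wrong
 2.5 FUZZY_OCR_CORRUPT_IMG  BODY: Mail contains a corrupted image
-0.2 AWL                    AWL: From: address is in the auto white-list
EOF
```

on a live system the same function could presumably be fed from the
test run itself, with the warnings on stderr kept out of the pipe.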


Re: what are default rule priorities?

2006-11-13 Thread snowcrash+spamassassin

check perldoc Mail::SpamAssassin::Conf --

...

The default test priority is 0 (zero).


ok.

i suppose this means that the searchable wiki does NOT include the
docs.  i thought it did.

thanks.


(fixed) Re: fuzzyocr 342 fires error warn, but scores anyway ... does it work?

2006-11-13 Thread snowcrash+spamassassin

GIF-LIB error: Failed to Read from given file.
[13690] warn: MLDBM error: Second level tie failed, No such file


after some monkeying about, it seems that the GIF-LIB error is
typical/common for non-gif &/or corrupt images.  these then,
apparently, get fixed and scanned.

the MLDBM error turned out to be a missing hash db file.  afaict,
fuzzyocr does not create them if missing.  so a 'touch' and correct
perms/own fix that problem.
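
a minimal sketch of that 'touch' + perms/own fix.  ensure_db_file and
the FuzzyOcr.db file name are hypothetical; the real path is whatever
hash db file your FuzzyOcr.cf points at, and the owner should be the
user SpamAssassin runs as.  the demo uses a throwaway directory so it
can be run safely.

```shell
#!/bin/sh
# Hypothetical helper: create an empty db file if it is missing, then
# set its owner and mode so the scanning user can read and write it.
ensure_db_file() {
    path=$1
    owner=$2
    mode=$3
    [ -e "$path" ] || touch "$path" || return 1
    chown "$owner" "$path" || return 1
    chmod "$mode" "$path"
}

# Demo in a temporary directory; path, owner and mode are assumptions.
demo_dir=$(mktemp -d)
ensure_db_file "$demo_dir/FuzzyOcr.db" "$(id -un)" 600
ls -l "$demo_dir/FuzzyOcr.db"
```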

so, i'm up & running with fuzzyocr, testing/scanning with no errors.


Re: what are default rule priorities?

2006-11-13 Thread snowcrash+spamassassin

 but, what ARE the other rules' priorities?

 is there documentation of that? nothing on the wiki that i've found.

Priorities don't exist in released versions of SA, only the 3.2
development branch.


as i understand it, fuzzyocr -- which runs with v3.1.x (SpamAssassin
3.1.4 or higher) -- specifically relies on priority to ensure that it
runs 'last'.

do i understand you correctly that it, then, has no effect with
v3.1.x? (seems to be working on my system ...)


As for what's in the devel branch, well, that changes regularly. If
you're using development snapshots, you should be comfortable reading
the source code, so that's where you should head.


i'm not using the dev branch/head.

i'm using a svn co of the 31 branch, which i understand is the 317
release, plus bug fixes etc.

thanks.


Re: what are default rule priorities?

2006-11-13 Thread snowcrash+spamassassin

Priorities have existed for a while.  3.2 will have short-circuit
capabilities, and combining those with changed priorities is
recommended.


ok.

thanks.