Theo Van Dinter wrote:
Sorry to be a killjoy here.
I have no problem with the criticism, but I think I've hit the end of what
I'm going to do on this one now that it's working without breaking anything.
I'm running out of time for some schoolwork that's due in a month and will
have to
Daniel Quinlan said:
I propose that we make a 3.0.0 release
Are we going to be able to close bug 3675 first (he asks innocently, after
having made trouble on reaching consensus on that very bug :-) )? That's
the only one left marked with a 3.0 target.
-- sidney
Daniel Quinlan said:
Yes. It's 4-to-3 in favor of orange
+1 for 3.0 release!
(Note: someone may want to address
[http://it.slashdot.org/comments.pl?sid=122734&cid=10320250 these
complaints] about this document.)
I did post a response in
http://it.slashdot.org/comments.pl?sid=122734&cid=10325272 [Anyone got
some spare mod points? :-) ].
There is one issue I missed and I
I stumbled across this article
http://www.macdevcenter.com/pub/a/mac/2004/05/18/spam_pt2.html
while Googling around for anything that relates cluster analysis
techniques to spam filtering.
This may be old knowledge to some people here, but was new to me.
Apparently the trainable spam filter in
Henry Stern wrote:
Apple Mail uses latent semantic analysis
for clustering
That sounds right. Some people there were looking at that for document
retrieval when I worked at Apple Research in the mid-90's.
By the way, have you seen the work applying case-based reasoning to
spam filtering?
Henry,
In the paper An Assessment of Case-Based Reasoning for Spam Filtering
http://www.comp.dit.ie/sjdelany/publications/AICS%202004%20(crc).pdf
the authors compare CBR and a naive Bayes (NB) with one conclusion (on
their test data, with their implementation of NB) that daily updating of
the
Sidney Markowitz wrote:
caused an improvement in FPs
but a degradation in FN rate
Typo - I left out mention that the result was using NB, and not using CBR.
-- sidney
andrew collier wrote:
i have the following problem when reporting spam
This mailing list is used by SpamAssassin developers to discuss ongoing
development work on SpamAssassin. Your question has nothing to do with that.
Your question is appropriate for the SpamAssassin users mailing list
(see
Justin Mason wrote:
The first fix is truncation of the text before passing to TextCat.
Michael, I think you were looking at this? the results are impressive,
if the text is truncated to 32k bytes:
It was me. I've been looking at ways to not have to create so much
garbage (I'm a lisp hacker --
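A rough sketch of the truncation idea mentioned above, in case it helps anyone following along. The helper name and the constant are made up for illustration and this is not the code that was checked in; it assumes the text is a plain byte string, so length() counts bytes.

```perl
use strict;
use warnings;

# Hypothetical limit, taken from the "32k bytes" figure quoted above.
use constant MAX_LANG_BYTES => 32 * 1024;

# Illustrative helper (not SpamAssassin's code): return at most
# MAX_LANG_BYTES of the input so the language guesser sees a bounded string.
sub truncate_for_langid {
    my ($text) = @_;
    return '' unless defined $text;
    return length($text) > MAX_LANG_BYTES
        ? substr($text, 0, MAX_LANG_BYTES)
        : $text;
}
```

The point is only that the expensive step gets a capped input; whether 32k is the right cap is a separate question.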
Added: spamassassin/trunk/t/memory_cycles.t
I just noticed this now while trying to make test on a machine that
doesn't have Devel::Cycle. Is that going to be a documented requirement now?
-- sidney
Justin Mason wrote:
the test should be a no-op without that module. did that not work?
This is extracted from output of make test, running under Cygwin with
perl 5.8.5
t/memory_cycles.Can't locate Devel/Cycle.pm in @INC (@INC
contains:
t . ../blib/lib /c/sasvn/trunk/blib/lib
ratan kamath wrote:
Query: If a mail arrives [...]
This mailing list is used by SpamAssassin developers to discuss ongoing
development work on SpamAssassin. Your question has nothing to do with that.
Your question is appropriate for the SpamAssassin users mailing list
(see the SpamAssassin wiki
Fred,
I noticed you mentioned in a bug comment about getting some information
using Ethereal. If you are also running Cygwin, could you help a bit
with bug #3917? I'm stuck because of some firewall issues that I have
not yet tracked down on the home machine where I can test.
What I'm trying to
Alexandr Orlov wrote:
X-Spam-Status: SpamAssassin Failed
It does not appear anywhere within the SpamAssassin source code.
Googling for that exact header showed up a number of messages with it,
all spam. At first I thought it must be a fake header added by some
spammers to try to fool
Daniel Quinlan wrote:
Please try to use the more standard perl formatting:
Do you see anything wrong other than two of the lines being more than 80
characters? I'll check in an update to fix that as soon as I finish
running a make test on the change.
-- sidney
Justin Mason wrote:
Sidney -- I think it's the
foo( bar )
vs.
foo(bar)
I prefer that too. I copied the style that was already in the code, and
I looked for something about that in the style guide and did not see any
mention of it one way or the other. Unless it is there and I missed it,
Daniel Quinlan wrote:
Heh, I was mostly talking about the paren style, actually, not the line
length (although now that you mention it).
There are a few hundred spaced parens in spamd.raw. I'll fix the lines I
changed if you want, but if it's ok with you I won't do a massive edit
of the file.
Or
Daniel Quinlan wrote:
* No space between function name and its opening parenthesis.
I did see that. That would allow foo( bar ) which is what I did. If you
want foo(bar) as a preferred style it would have to be added to the wiki
page.
-- sidney
I just tried a quick build and make test in Windows XP to see what it
would do, and
1. I could not reach the svn server from svn, although I could ping it.
Is it down?
2. I got lots and lots of
Use of uninitialized value in concatenation (.) or string at
The error message from ArchiveIterator.pm is because Windows does not
define $HOME environment variable by default. It has $HOMEDRIVE and
$HOMEPATH which together serve the same purpose. The code in
ArchiveIterator.pm has to be changed to check for Windows, or else we
can document the need to
Malte S. Stretz wrote:
What does getpwuid() say on Windows?
Not implemented :-)
You can't use getpwuid in Windows. The usual portable implementation
checks for running under Windows and uses $ENV{'HOMEDRIVE'} .
$ENV{'HOMEPATH'} if it is instead of $ENV{'HOME'}, being careful about
the former
Malte S. Stretz wrote:
So maybe we should add a M::SA::Util::get_home() which first
tries $ENV{HOME}, then on Windows $ENV{HOMEDRIVE}\$ENV{HOMEDIR}, then
portable_getpwuid()[7], then... foo?
portable_getpwuid() doesn't seem to do anything useful under Windows for
this purpose and shouldn't be
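For what it's worth, the lookup order discussed in this thread could be sketched roughly as follows. The name get_home() comes from the suggestion quoted above, but the body, the optional arguments (a fake environment and OS name, there only so it can be exercised anywhere), and the decision to skip getpwuid() on Windows are assumptions for illustration, not code from the tree.

```perl
use strict;
use warnings;

# Hypothetical helper: find the user's home directory portably.
sub get_home {
    my ($env, $os) = @_;
    $env ||= \%ENV;    # real environment unless a fake one is passed in
    $os  ||= $^O;      # real OS name unless overridden

    # 1. An explicit HOME wins on every platform.
    return $env->{HOME} if defined $env->{HOME} && length $env->{HOME};

    # 2. On Windows, combine HOMEDRIVE and HOMEPATH when both are set.
    if ($os eq 'MSWin32') {
        my ($drive, $path) = @{$env}{qw(HOMEDRIVE HOMEPATH)};
        return $drive . $path
            if defined $drive && defined $path && length "$drive$path";
        return;        # getpwuid() is not implemented there, so give up
    }

    # 3. Elsewhere, fall back to the password database.
    my $dir = eval { (getpwuid($>))[7] };
    return $dir;
}
```

Callers would still need to cope with an undefined result, for example when neither variable is set on Windows.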
Malte S. Stretz wrote:
oops :) But I'm glad you didn't notice my HOMEx debugging glitch :)
I did, but I understood what it was for :-)
I spoke too soon about it working. When I add a -w to the perl command
it barfs in catpath, I think because it expects to be passed all three
arguments, volume,
Daniel Quinlan wrote:
[EMAIL PROTECTED] (Justin Mason) writes:
CFP ends in 4 days though.
If the trend in conference quality continues
Oh, then I _do_ have time to design, research, write, and propose a
paper for it!
-- sidney :-)
Loren Wilton wrote:
Doesn't the free VC install include nmake? The normal one does.
No, that's the problem. No nmake, no winsock.h, necessitating two more
big downloads in addition to the free toolkit.
The DDK also includes Nmake, and a considerably newer version than what
Well, I guess that
Any comments? Interest in co-authoring a research paper (*poke*,
I might have some ideas about it... especially if it could be related to
classification of cancer cells based on microarray gene expression data :-)
Now I have something to think about on my ferry commute this morning.
-- sidney
Nick Leverton said that papers he has seen found that learn on error
always works better than learn everything. But I recall one that looked
more carefully at longer term results and found that learn on error
degrades over time. They found it best to retrain on fresh data every
few months. (I
Daniel Quinlan wrote:
Bugzilla says we can release 3.0.2 so I therefore propose we release 3.0.2.
+1!
-- sidney
http://www.sidney.com
Justin Mason wrote:
(b) however the -parker- and -sidney- ones *are* getting annoying. ;) I
suggest we turn off those slaves until we can figure out how to get
buildbot to work with dynamic-IP slaves...
I'm running three slaves on one machine, two of them on the same VMWare
virtual machine and
Justin Mason wrote:
Sidney, have you tried setting --keepalive=300
I'll try that. What Michael says does make sense. I'm behind a NAT.
Is there a way of setting a port that the slave listens on? I can
configure my NAT to let the slaves be designated servers on some port if
I can make it a fixed
Justin Mason wrote:
might be worth signing up to buildbot-devel (it's very low traffic)
and mention that...
I'm going away on holiday soon for a couple of weeks. I'll look at that
after I come back. There may be some issues to work out if I'm going to
test with their latest cvs version and
I'm seeing the following in make test in the spf test. It doesn't show
in the buildbot test because they skip SPF. (As an aside, why do they
skip it?)
$ t/spf.t
1..2
# Running under perl version 5.008005 for cygwin
# Current time local: Sun Dec 19 09:49:57 2004
# Current time GMT: Sat Dec 18
I had a power glitch here which rebooted the server. I think it happened
in the middle of the svn update causing all three slave jobs to fail,
and I think that it was a power glitch that caused the reboot. I'm not
going to bother to bring the buildbot slaves online again before I leave
on
William Holman wrote:
I've been over-ruled by those who pay the bills, so I can't use
SpamAssassin since it's open source
What bills? -- It's open source! :-)
If you look at the SpamAssassin wiki you can find a list of products
that are based on SpamAssassin that your billpayers can feel happy
still being mostly ignored. Anyone who
cares about a specific bug report should speak up and make their case for
it.
-- Sidney Markowitz
http://www.sidney.com
of
the SpamAssassin release cycle. SpamAssassin could download the list more
or less often depending on how volatile the list is. My guess is that
monthly is fine, as that is much better than once per SA release cycle.
Sidney Markowitz
http://www.sidney.com
popular antispam program with
enough use to affect spammers' behavior.
Sidney Markowitz
http://sidney.com
Daryl C. W. O'Shea said:
The emails generated could be used to calculate
the domains most often seen.
I would be afraid of it being too easy for malicious people to hack by
sending in false data, DoS attacks on the email addresses, etc. Also
there is no reason to load down some email address
Is anyone else seeing problems accessing the SpamAssassin svn? I can't
connect to the server using svn from my machine and
http://svn.apache.org/viewcvs.cgi/spamassassin/trunk/?root=Apache-SVN
does not respond either. Ping works.
-- sidney
Malte S. Stretz wrote:
Ok, I added some code for this in r153131. Could you please test it (just
do a 'make clean; make'), especially on Windows?
Ok, my Windows machine is working and the disk is mostly restored now,
and I found the right thread to report this...
Malte, the current makefile
Daniel Quinlan wrote:
We support nmake?
That's the Microsoft nmake, not to be confused with any other make
program of the same name. It's what is available on Windows. For
compatibility we have to put all the fancy logic in the perl of
Makefile.PL so the resulting makefile is written to a dumbed
Malte S. Stretz wrote:
Sidney, could you test r154095 on Windows please?
It works. BTW, my buildbot slaves are running again so you can see
immediately, e.g.,
http://bugzilla.spamassassin.org:8010/trunk-sidney-win32/builds/51
-- sidney
for this
problem.
Thanks,
-- Sidney Markowitz
http://www.sidney.com/
t/debug.t and t/spf.t both have failures. I'm not sure how long ago they
started failing as the failures are hidden by the warning-only failures
in rule_names.t.
Is there a way that we can distinguish between rule_names and the other
failures so that we can go back to sending notification emails
I fixed the test failure in t/debug.t checking in to r155617.
The test was just missing a new dbg message tag, replacetags, so I added
it to the list.
I'm less sure about what is the correct thing to do for the failure in
t/spf.t. In that case there is a test for SPF_HELO_FAIL in the test
spam.
Justin Mason wrote:
According to the SPF people, we shouldn't
be using -all on a domain that may possibly emit mail. So I changed
the record...
That can't be right. Try out the wizard at
http://spf.pobox.com/wizard.html?mydomain=spamassassin.org
It gives you two choices in the last question
Justin Mason wrote:
According to the SPF people, we shouldn't
be using -all on a domain that may possibly emit mail
Even if, as I think, ~all is correct if you can enumerate all legal
senders for the domain, there still is a problem with making our test
depend on the current configuration of
Shelby,
This mailing list is for developer discussions. Developers consist of
the people who have commit access to our source control system, SVN.
As per Apache Foundation policies, the development process is
transparent. That means that the technical and design discussions we
developers have
Daniel Quinlan wrote:
aspects of the AL 2.0 don't really translate to services, but use does
and that's my main concern with Razor2.
I find Theo's argument that use of the razor server is always free to a
user of a free SA distribution compelling.
Code being free but charging for service is in
I vote +0.5 for Fri Mar 11.
I'm voting for that date because it is a weekend here on the other side
of the world, which is the only time I can do anything.
I'm only voting 0.5 because I probably still won't have much time, even
on a weekend :-(.
-- sidney
Frederik Eaton wrote:
Is it possible to configure spamassassin to get back the original
functionality of only modifying headers of spam
1. Look up the doc on rewrite_header and report_safe in man
Mail::SpamAssassin::Conf or other documentation
2. Any further questions about this or similar
Daryl C. W. O'Shea wrote:
Shouldn't people evaluate whether or not they are eligible to use Razor2
before downloading (and installing) the razor-agents from Vipul's website?
That was the substance of the reply I tried to write last night but was
too sleepy to finish.
I thought about how I
Duncan Findlay wrote:
That's arguably a bug in the operating system then
I don't think it is even that, but I agree with you that it is not our
place to work around it.
Consider this: Razor is free to use if the client software is free. The
client module may come freely with the OS. The client
Daniel and SpamAssassin are on Slashdot!
http://it.slashdot.org/article.pl?sid=05/03/04/2010218&tid=111
-- sidney
Shelby Moore wrote:
Sidney Markowitz wrote:
This mailing list is for developer discussions. I could try to explain
what that means, but I'm afraid that you may not have the awareness of
personal or social boundaries to be able to use the explanation.
There you go again trying to ERRONEOUSLY
Frederik Eaton wrote:
As developers, you might want to add that information to the
part of the man page I quoted
I assume that you are referring to the released version of SpamAssassin.
Looking at our latest development version I see that the wording has
already been changed to make that
Frederik Eaton wrote:
Also, with all due respect, you really didn't have to be
such an asshole
Reading my words quoted back to me, I agree. The question as you asked
it was more appropriate for the users list. My response to that effect
was posted to the list because people reading this list
Thanks for your interest in helping improve things, but please read
http://wiki.spamassassin.org/DoYouWantMySpam
for the FAQ about not sending spam samples to our mailing lists.
-- sidney
Tony Finch wrote:
Is anyone planning to implement CSA for SpamAssassin?
I'm not, but I do have a question about it. Is it something that would
best be implemented on the MTA to reject fake SMTP servers, or does it
have a maybe case which would be best handled by a SpamAssassin rule
without
Duncan Findlay wrote:
That's a pretty significant change for a maintenance release.
Yes, and I mention it to bring it to his attention. I guess it's up to him
to decide whether or not to back port the patch, and then it is up to us
whether to accept it in an official 3.0.3 release, just like it
[EMAIL PROTECTED] wrote:
Added testcase from Bug4191
This test fails on my Fedora Core 3 system with svn trunk even though bug
4191 has a comment that says that it is fixed in 3.1.
t/uri...FAILED test 77
Failed 1/76 tests, 98.68% okay
-- sidney
Theo Van Dinter wrote:
I have ~300K of them. http://www.kluge.net/~felicity/set1.txt
This should not be happening anymore since the patch for bug #4260 was
committed to trunk. Are you still getting them? The warning was only there
to help us track down that problem.
If we are sure that the
Theo Van Dinter wrote:
The output is from my Saturday weekly net run. It looks like 4260 was
committed as r161778, the nightly run was r164362.
Yuck, this looks like you are still getting DNS records in the wrong order.
Look at that first log entry. It says that a query for
Theo Van Dinter wrote:
It looks like 4260 was
committed as r161778, the nightly run was r164362.
Do people think we should reopen 4260?
This could happen if the random ID isn't random enough or 16 bits isn't
large enough to avoid collisions. I don't see how that would happen if
different
Daniel Quinlan wrote:
It's not used in the t test itself.
Thanks, that helps. I suspect that whatever is causing the hang in bug 4278
has a symptom of a DNS query failing without hanging when it doesn't hang.
Now that I know that the $bind variable has nothing to do with it I can
track that
Loren Wilton wrote:
How about a simple debug printout of the id value sent and the id value
received? Maybe it is as simple as the id matching code is failing.
That's definitely a better idea considering that there is a bug in the patch
I posted that prevents any of the DNS stuff from working
This is the corrected patch that ensures that IDs are not colliding by
including the host name in an SHA1 hash with the 16 bit ID counter.
It is written a bit crudely, but if Theo or someone else who is seeing the
problem would try this in a mass test it would demonstrate whether the
problem has
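To make the idea concrete, here is a minimal sketch of keying pending queries on a hash of the packet ID plus the host name. It is not the patch itself: query_key(), %pending, and the example host names are invented for illustration, and it uses the core Digest::SHA module where older installations would more likely have Digest::SHA1.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha1_hex);

# Hypothetical helper: derive a lookup key from the 16-bit packet ID and
# the queried host name, so two queries that share an ID do not collide.
# The name is lowercased because DNS names are case-insensitive.
sub query_key {
    my ($id, $host) = @_;
    return sha1_hex(($id & 0xFFFF) . '/' . lc($host));
}

# Two outstanding queries with the same ID stay distinct in the table.
my %pending;
$pending{ query_key(4711, 'example.com') } = 'first lookup';
$pending{ query_key(4711, 'example.org') } = 'second lookup';
```

A reply is then matched by rebuilding the key from the ID and question name found in the reply packet.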
Matt Sergeant wrote:
May be a problem with forking. Here's part of the fork replacement I use
in my code that uses the single-packet-DNS stuff:
Justin's code generates a number from the pid to initialize the ID
counter and keeps track of it itself instead of relying on the Net::DNS
code. Are
Theo Van Dinter wrote:
The patch does make things *much* slower though, around 3x:
[...]
Without the patch, lots of issues starting after 80%.
I don't claim that the patch is the most efficient way of dealing with it...
I just wanted to use SHA1 to ensure that there was no chance of an ID
Sidney Markowitz wrote:
use something that combines the pid and id
Brain fade... This patch works by matching information that is in the reply
packet to information in the query packet, which means it has to use the
host name and the packet ID. Duh! Sorry.
Still, we could try some debug log
I haven't been running mass-checks until now, but I just tried it with svn
trunk and got a couple of bogus rr warnings between the 50% and 60%
marks so far. It's taken two and a half hours to get that far, so this is a
very slow process. I just shut down the vmware session that was running
Matt Sergeant wrote:
I didn't think you could do that because in newer versions of Net::DNS
the id is a lexical variable. The only way to reinitialise it is to
reload the module.
If I remember it correctly, Justin's code keeps its own counter and sets the
packet ID after creating the packet,
Sidney Markowitz wrote:
I guess that would break down if there are any uses of Net::DNS by the same
process that do not go through his code
grep doesn't find any other use of Net::DNS :-(
I just got another 10 bogus rr hits between the 60% and 70% marks on my mass
test run. I wonder what
Matt Sergeant wrote:
May be a problem with forking
Do you think that this code fragment I see in SpamAssassin.pm should work as
well as your fork code, or could relying on this be part of the problem?
sub init {
my ($self, $use_user_pref) = @_;
# Allow init() to be called multiple times,
Theo Van Dinter wrote:
I'm trying a small patch which basically calls the reinit function when
the counter wraps to 0, as well as using rand when initializing. This way
it'll get a new random starting point and a new socket occasionally.
I think I understand the problem now. It's similar to
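A small sketch of what "reinit when the counter wraps, seed with rand" might look like. Everything here is a stand-in: reinit_socket() only picks a new starting point and bumps a counter, where the real function would also open a fresh socket.

```perl
use strict;
use warnings;

my $id_counter   = int(rand(0x10000));   # random starting point, 0..65535
my $reinit_count = 0;                    # how often we re-initialised

# Hypothetical stand-in for the reinit function mentioned above.
sub reinit_socket {
    $id_counter = int(rand(0x10000));
    $reinit_count++;
    return;
}

# Hand out the next 16-bit ID; on wrap to 0, re-initialise first.
sub next_id {
    $id_counter = ($id_counter + 1) & 0xFFFF;
    reinit_socket() if $id_counter == 0;
    return $id_counter;
}
```

Because the counter is 16 bits, any run of 65536 calls is guaranteed to hit at least one re-initialisation.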
Sidney Markowitz wrote:
I think it would be better to create the new socket with each message. If
old replies are arriving as they seem to, wouldn't it be more efficient to
not have a listener on the socket when they arrive?
I got confused when I reread this, so I thought I should clarify
I'm going to respond to your and John Gardiner Myers' replies in the bug
4260 discussion to keep everything tracked there now that I've re-opened the
bug.
-- sidney
Loren Wilton wrote:
Depending on the value of the parameter
that Perl is deducing from that statement, you may or may not be getting the
results you expect.
From the doc:
srand
Sets the random number seed for the rand operator. If EXPR is omitted,
uses a semi-random value based on the
Warren Togami wrote:
Why bother pushing another tarball just for a single patch that
affects only one distribution?
If I understand the preceding discussion correctly this is not a matter of
release early, release often carried to an extreme. It is an abort of the
release process for 3.0.3
I just saw this in a make test in Win32 that I am running right now.
I'm posting this to sa-dev because I have to go to sleep before the make
test finishes and so cannot see if it dies the same in Cygwin or elsewhere,
and I can't look at it right now:
t\meta..'..' is not
Sleep and kids don't always go together.
Here's the other test that failed in Win32, posted here in case anyone can
do anything with it. It works in Cygwin. After I post, I _will_ sleep...
t\bayessdbm.ok 48/52# Failed test 49 in t\bayessdbm.t at \
line 262
Sidney Markowitz wrote:
The correct fix for 3.0 branch, assuming that spf.t there is still testing a
DNS record over which we have no control
Hmm, I looked. It doesn't. I'm downloading 3.0 branch now to see what is wrong.
-- sidney
Now I remember what happened. We weren't using something like aol.com we
were using spamassassin.org, our real spf record. We changed it to do more
of the right thing and broke the test that counted on its old value.
I'll look into it more to see if there is a way to make the test work the
way
I just did a little experiment. I placed an entry for the ip address of one
of my web servers in /etc/hosts (or rather the Windows equivalent of it on
my PC) with host name www_host.exam_ple.com. I emailed myself a message
containing the text http://www_host.exam_ple.com
When I looked at the
Frederik Eaton wrote:
How are the rule weights for spamassassin generated? There is a method
called boosting
The rule weights are generated using a single-layer perceptron, as described
in the wiki link that Daniel mentioned.
I'm writing a paper this semester [I hope :-)] looking at the
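For readers who have not met the term, a single-layer perceptron can be sketched in a few lines. This is the textbook perceptron update rule on invented toy data (three hypothetical rules, four messages), not SpamAssassin's actual score generator, which is considerably more elaborate.

```perl
use strict;
use warnings;

# Toy data, invented for illustration: which of three hypothetical rules
# hit (1) or missed (0) on a message, plus its label (1 = spam, 0 = ham).
my @train = (
    [ [1, 1, 0], 1 ],
    [ [1, 0, 0], 1 ],
    [ [0, 0, 1], 0 ],
    [ [0, 1, 1], 0 ],
);

my @w    = (0, 0, 0);   # one weight per rule
my $bias = 0;
my $rate = 0.1;         # learning rate, chosen arbitrarily

# Classify as spam when the weighted sum of rule hits is positive.
sub predict {
    my ($hits) = @_;
    my $sum = $bias;
    $sum += $w[$_] * $hits->[$_] for 0 .. $#w;
    return $sum > 0 ? 1 : 0;
}

# Classic perceptron rule: adjust weights only on misclassified messages.
for my $epoch (1 .. 20) {
    for my $row (@train) {
        my ($hits, $label) = @$row;
        my $err = $label - predict($hits);
        next unless $err;
        $w[$_] += $rate * $err * $hits->[$_] for 0 .. $#w;
        $bias  += $rate * $err;
    }
}
```

After training, rules that fire mostly on spam end up with positive weights and rules that fire mostly on ham with negative ones, which is the intuition behind per-rule scores.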
Fred wrote:
There was similar work being done in the past to identify rules to be
grouped into new meta rules, this (w|c)ould achieve similar results.
http://bugzilla.spamassassin.org/show_bug.cgi?id=1363
I think I'm missing something here. Are you saying that automatically
grouping rules into
Theo Van Dinter wrote:
-1
Don't use M::SA unless it's necessary (no reason to load a bajillion
things). Just use M::SA::Message.
I see that Mail::SpamAssassin->parse just calls
Mail::SpamAssassin::Message->new and returns the Message object. Is this the
correct syntax to use then instead of
Does anyone have any objection to my checking in the following change? It
makes the code in Dns.pm independent of the format of the key that is used
to check the reply packets so that it will be easier to play with using
different keys such as hashes by changing only code in DnsResolver.pm.
--
Justin Mason wrote:
looks fine to me -- however there are other calls to that bgsend() method
elsewhere. it may need to be made there too.
Good point. I forgot to grep to make sure I wasn't missing anything. Make
test didn't show problems, but it wouldn't until I actually tried to
change from
It's funny except I'm getting one of those challenge messages for each
one I send to this list. I don't want to give in to that crap by
responding to register my email address with a stranger. I guess that's
what blacklist-from is for.
I wonder if that service has the ability to whitelist-to a
Is the build broken or is it something I screwed up locally? mimeheader.t and
uri_html.t are breaking when I run them:
$ t/mimeheader.t
1..2
# Running under perl version 5.008006 for cygwin
# Current time local: Wed May 11 23:12:38 2005
# Current time GMT: Wed May 11 11:12:38 2005
# Using
Justin Mason wrote:
Are there tests in the test suite for the redirector usage case btw?
Excuse me if I'm misunderstanding the question in my fog-before-first-coffee
of the morning...
The redirector patterns are hardcoded in sub try_canon in uri.t so any
change to them in 20_uri.cf has to be
Theo Van Dinter wrote:
I don't know if this is a known issue, but it seems like tests 1-18
fail (of
22) for t/dnsbl.t ... From what I can see, most of the lookups
timeout at
15s which blows the tests out of the water.
I've been seeing that with varying regularity depending on which network
I got some debugging output and it looks like something is quite wrong, but
I don't have time to look at it right now. Maybe tonight or tomorrow if
nobody else catches it first.
-- sidney
Theo,
Could you try running with the bogus rr for domain warn statement in
URIDNSBL modified to output $packet and $ent->{id} instead of
$packet->header->id? That will make the warning message a bit more
verbose, but you aren't seeing that many of them anyway, and it will
provide helpful debug
Justin Mason said:
mystery solved ;)
Aww, I was looking forward to tracking down a really mysterious bug :)
-- sidney
Does anyone know if we should be able to use the latest version of
Buildbot, 0.6.5 with buildbot.spamassassin.org? I know that I could just
try it, but I don't want to spend time trying to get it to work only to
find that the master has to be upgraded first.
-- sidney
Justin Mason said:
yep, should be possible -- create a t/config file that enables it in
the buildbot slave's checkout ;)
I thought that's under the control of the master. Doesn't the script
recreate the entire trunk every time? Oh, of course that would be too
expensive. Ok, I'll edit the
Justin Mason said:
afaik you can.
Ok, I'll try it. First I'll confirm that I can get the 0.6.2 that I have
installed running again, as I've had it down for a while.
Another question -- Can we have a way of enabling network test for the
buildbot runs? I can see how it should be an option, as