On 27.08.2011 11:53, Ladar Levison wrote:
> While testing my code I stumbled upon a couple bugs. My test corpus 
> includes a number of messages that break formatting rules. The rogue 
> test messages allowed me to find a potential memory leak and a 
> strcmp() call that reads uninitialized bytes. Attached is a valgrind 
> log showing the bugs and a patch to fix the problems.
>
This should be fixed in Git.


> By pumping emails through the library I was able to look for bugs that 
> might trigger a crash but I still don't have a way to test whether 
> DSPAM is classifying emails correctly. Can anyone point me to a 
> standardized test corpus
Most people use the Apache SpamAssassin corpora or the TREC corpora for testing.

> and the scripts/tools needed to test that corpus using different DSPAM 
> configs?
You could use dspam_train for the run and dspam_stats to set/reset the snapshot.

> The doc/tests.txt file shows results from 2009 but doesn't say how to 
> reproduce the experiment. If my own test results matched what the core 
> DSPAM developers got I'd be a happy code monkey... 
I don't understand what you mean by this. Are you trying to get a 
certain score/result that you can compare with those of other DSPAM 
users/developers?
I don't know how others benchmark their setups (or whether they 
benchmark them at all). Over the years I have developed my own testing 
and training method; I don't use the stock DSPAM methods at all. I 
guess other DSPAM users/admins have established their own test and 
training procedures as well.
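
Whatever procedure is used, results only become comparable once they are 
reduced to the same few numbers. The sketch below is not part of DSPAM 
and is only an illustration: it assumes you have already collected 
(expected, predicted) label pairs from a corpus run by whatever means, 
and the function name score() and the labels 'spam'/'ham' are made up 
for this example.

```python
from collections import Counter


def score(results):
    """Summarize a corpus run.

    results: iterable of (expected, predicted) pairs, where each label
    is the string 'spam' or 'ham' (hypothetical labels for this sketch).
    """
    counts = Counter(results)
    tp = counts[("spam", "spam")]  # spam correctly caught
    tn = counts[("ham", "ham")]    # ham correctly delivered
    fp = counts[("ham", "spam")]   # ham wrongly flagged as spam
    fn = counts[("spam", "ham")]   # spam that slipped through
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
        "false_negative_rate": fn / (fn + tp) if (fn + tp) else 0.0,
    }
```

Two people running the same corpus in the same message order with the 
same training mode could then compare these three rates directly.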

> And if I could trigger a test run via `make check` I'd be the happiest 
> code monkey in the bazaar.
This is difficult. The backend is selected with ./configure, but it is 
most likely not initialized at that point, and a 'make check' would 
require a properly configured backend (with the schema and access 
already set up), which is not available on a fresh setup at compile 
time.
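
One possible way around this, offered only as a sketch and not as 
anything that exists in the DSPAM tree: Automake's test harness treats 
exit status 77 as "skipped", so a test script could skip itself when no 
backend has been prepared. The environment variable DSPAM_TEST_CONF and 
the function name below are hypothetical.

```python
import os

SKIP = 77  # Automake's test harness reports exit status 77 as SKIP


def backend_exit_code(env):
    """Return SKIP unless env points at an existing test config file.

    DSPAM_TEST_CONF is a made-up variable name for this sketch; it would
    hold the path to a config file for an already initialized backend.
    """
    conf = env.get("DSPAM_TEST_CONF")
    if not conf or not os.path.isfile(conf):
        return SKIP
    return 0
```

A wrapper script would call backend_exit_code(os.environ) first and exit 
with 77 before running any tests if that is what it returns, so a fresh 
checkout would report the tests as skipped rather than failed.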


_______________________________________________
Dspam-devel mailing list
Dspam-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspam-devel
