Firstly: Hi, this is my first posting to the list.

Secondly:

> It seems to work well but isn't based on much more than a whim and a
> little observation.  I get very few ham hits on _my_ mail with it, but
> I mainly get pretty clean looking ham.

This seems like a sensible approach in general. However, it should be
possible to employ a more 'justifiable' logic. My university degree was
in Artificial Intelligence and Software Engineering, and I covered a
number of topics that might be relevant here.

Linear Dimension Reduction is a technique for identifying the
'significant' dimensions of a set of vectors. If each mail is treated as
a vector, then this might help to identify which messages tend to group
together. I'm thinking about this as a way of improving our own spam
filtering.
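
To make the idea a bit more concrete, here is a rough sketch of one
linear dimension reduction method (principal component analysis, found
by power iteration). It is only an illustration under my own
assumptions: the four sample messages are made up, each mail becomes a
plain word-count vector, and the function names (bag_of_words, centre,
top_component, project) are hypothetical rather than taken from any
existing filter.

```python
import math
from collections import Counter


def bag_of_words(messages):
    """Turn each message into a word-count vector over a shared vocabulary."""
    vocab = sorted({w for m in messages for w in m.lower().split()})
    rows = []
    for m in messages:
        counts = Counter(m.lower().split())
        rows.append([float(counts[w]) for w in vocab])
    return vocab, rows


def centre(rows):
    """Subtract each column's mean so the result reflects variance only."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - mu for x, mu in zip(row, means)] for row in rows]


def project(rows, direction):
    """Score of each row along the given direction (a dot product)."""
    return [sum(x * d for x, d in zip(row, direction)) for row in rows]


def top_component(rows, iterations=200):
    """Power iteration on X^T X: unit direction of greatest variance."""
    dim = len(rows[0])
    # Deterministic, non-uniform starting vector (an arbitrary choice).
    v = [float(i + 1) for i in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        scores = project(rows, v)
        w = [sum(s * row[j] for s, row in zip(scores, rows)) for j in range(dim)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # all rows identical: no variance to find
        v = [x / norm for x in w]
    return v


# Made-up sample mail: two spam-like and two ham-like messages.
messages = [
    "cheap pills buy now",
    "buy cheap pills online now",
    "meeting agenda for monday",
    "agenda for monday meeting attached",
]
vocab, rows = bag_of_words(messages)
centred = centre(rows)
component = top_component(centred)
scores = project(centred, component)  # one number per message
```

Each mail is reduced from a ten-word vector to a single score, and
messages with similar vocabulary end up with scores of the same sign.
Real mail would need a far larger vocabulary and more than one
component, so treat this as a toy.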

Another option that might have some relevance is something similar to
www.touchgraph.com. If the messages were mapped using this software,
identifying 'clusters' might be possible, and hence improve the accuracy
of your structure.
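
As a rough idea of what 'clusters' could mean here, the sketch below
links any two messages whose word counts are similar enough and then
reads off the connected groups. It does not use TouchGraph at all; the
cosine similarity measure, the 0.5 threshold, the sample messages and
the function names are all my own assumptions for illustration.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two word-count Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def clusters(messages, threshold=0.5):
    """Group message indices that are connected by similarity links."""
    vectors = [Counter(m.lower().split()) for m in messages]
    n = len(vectors)
    # Build an undirected graph: an edge means "similar enough".
    neighbours = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if cosine(vectors[i], vectors[j]) >= threshold:
                neighbours[i].add(j)
                neighbours[j].add(i)
    # Depth-first search to collect each connected component.
    seen = set()
    groups = []
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        group = []
        while stack:
            node = stack.pop()
            group.append(node)
            for nxt in neighbours[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        groups.append(sorted(group))
    return groups


# Made-up sample mail; the last message shares no words with the others.
sample = [
    "cheap pills buy now",
    "buy cheap pills online now",
    "meeting agenda for monday",
    "agenda for monday meeting attached",
    "lunch tomorrow?",
]
groups = clusters(sample)
```

On this sample the two spam-like messages form one group, the two
meeting messages another, and the last message stays on its own. The
threshold would need tuning against real mail.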

Anyway, I'm just dumping some ideas that I've had recently. I'm
investigating them all, but if anyone is interested in helping or
whatever, drop me a line personally and we can bash our heads together
for a while.

Richard



