Steve wrote:

>>
> What? It does not expect the system administrator to do advanced math. The
> module itself would do the math. It just needs to know what longitude and
> latitude to use as the starting point (for you that would probably be
> somewhere in NL). And every good system administrator should know how to get
> his longitude and latitude. That's not rocket science (every system
> administrator has surely seen or used LOC in DNS at some point).

> Please read the SNARE study mentioned above. Especially this part here:
> -----
> Furthermore, the researchers found that by plotting the geodesic
> distance between the Internet Protocol (IP) addresses of the sender and
> receiver--measured on the curved surface of the earth--they could
> determine whether the message was junk.
> -----

What I meant was that the expectation that long-distance e-mail is more
likely to be spam is not equally valid for everyone. Within the framework
of the SNARE report (which I read before my post), the expectation
is correct. But depending on your actual location, IMHO the results of
this test may be mere noise or add only a small amount of value.

But the actual outcome of this specific test was not my point.

>>
>> Classifying on country is cheaper in cpu-time, and enables you to
>> actually target known sources.
>>
> The computation of the distance is cheap in CPU time. It is just a bunch of
> acos, cos, sin, etc. calls and a bunch of multiplications, additions and
> subtractions. In fact, it is just one line of Perl code.
>
> You are splitting hairs over an imaginary example. Where is the study
> showing that the above computations are relevant in SPAM fighting?
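
For reference, the acos/cos/sin computation described above can be sketched
roughly as follows. This is only an illustration in Python, not the Perl
one-liner mentioned in the thread and not code from policyd-weight. The
function name great_circle_km and the Earth radius constant are assumptions
made for this example, and it uses the spherical law of cosines, which may
differ from whatever formula the module would actually use.

```python
import math

# Mean Earth radius in km (assumed value for this sketch).
EARTH_RADIUS_KM = 6371.0


def great_circle_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees.

    Uses the spherical law of cosines, i.e. a handful of sin/cos/acos
    calls plus a few multiplications and additions.
    """
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    cos_angle = (math.sin(phi1) * math.sin(phi2)
                 + math.cos(phi1) * math.cos(phi2) * math.cos(dlon))
    # Clamp to [-1, 1]: floating-point rounding can push the value
    # slightly out of range, which would make acos raise ValueError.
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return EARTH_RADIUS_KM * math.acos(cos_angle)


if __name__ == "__main__":
    # Roughly Amsterdam to New York (coordinates are approximate).
    print(round(great_circle_km(52.37, 4.90, 40.71, -74.01)), "km")
```

The starting point (the first coordinate pair) would be the administrator's
own location, and the second the estimated location of the sending IP. How
that location is looked up is outside the scope of this sketch.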

The point is that I can think up many statistical relationships between
spam and its characteristics (academically tested or not), and computing
them can keep my spam filtering machine busy all day.
But I don't think that adding 200 extra tests to your setup increases
accuracy; it only increases CPU cycles. The result is that you lose the
efficiency vs. accuracy trade-off.

IMHO, you should stick with a few methods that are effective, and work
with them. Testing experimental stuff is fine, but adding (i.e. not
replacing) methods that add 0.1% accuracy without thinking about
efficiency is not worth all the work.

>
> But hey! I just tried to be helpful by offering to include the computation of
> real distance in policyd-weight. I did not say that the computation would be
> the next big thing preventing SPAM on your system. Okay. Since this is the DSPAM
> mailing list, I should stop posting about other solutions here...

That's great, and thank you for that. I just meant to add a sidenote
about efficiency :)

This thread is now dead :)

--
Regards,
        Tom
