Steve wrote:
> -------- Original Message --------
>> Date: Fri, 31 Jul 2009 11:17:24 -0700
>> From: "Michael Watkins" <[email protected]>
>> To: [email protected]
>> Subject: Re: [Dspam-user] RBL Configuration
>
> btw: Since you are using Geo-IP... I could extend the Geo-IP patch to allow
> scoring by distance. I came to that idea after reading about SNARE
> (http://www.technologyreview.com/communications/23086/). It is actually
> pretty easy to do the calculation of the distance. Just out of curiosity I
> quickly coded a Perl script using Geo::IP to extract the latitude and
> longitude of your host (solutionroute.ca) and the same info for
> www.sourceforge.net and then display some info (so I just know that I did it
> right in Geo::IP) and then compute the distance in kilometers. This is the
> result:
> -------
> Info for www.sourceforge.net
> Country Code: US
> Country Code3: USA
> Country Name: United States
> Region: CA
> Region (Name): California
> City: Mountain View
> Postal Code: 94041
> Latitude: 37.3885
> Longitude: -122.0741
> Time Zone: America/Los_Angeles
> Area Code: 650
> Continent Code: NA
> Continent Name: North America
> Metro Code: 807
>
> Info for solutionroute.ca
> Country Code: US
> Country Code3: USA
> Country Name: United States
> Region: MO
> Region (Name): Missouri
> City: Kansas City
> Postal Code: 64106
> Latitude: 39.1068
> Longitude: -94.5660
> Time Zone: America/Chicago
> Area Code: 816
> Continent Code: NA
> Continent Name: North America
> Metro Code: 616
>
> Distance in Km: 2400.5724862323
> -------
>
> I used the freely available GeoLiteCity.dat
> (http://geolite.maxmind.com/download/geoip/database/) to get the extended
> data.
>
> I have not added that yet to policyd-weight but I am really tempted to add
> it. What I don't know yet is how to make the lookup table. The problem I see
> with the lookup table is that I just have the distance and I need to score if
> a certain distance is reached but look at this example:
> ----
> @distance_score = (
> # DISTANCE IN KM, NO MATCH, MATCH, LOG NAME
> "1000", -0.50, 0.50, "1000_KM",
> "2000", -0.50, 1.00, "2000_KM",
> "4000", -0.50, 1.50, "4000_KM",
> "8000", -0.50, 2.00, "8000_KM",
> "16000", -0.50, 2.50, "16000_KM",
> );
> ----
>
I think this is a bit far-fetched. It expects the sysadmin to do some
advanced math depending on his location. Seen from Europe, the major spam
locations (e.g. BR, US, CN) are all "far away", but when you're in the US,
the table above does not work. Recalculating the table then has to be based
on data you already have: the known spam countries.
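
To make the location dependence concrete, here is a rough Python sketch (not the Perl/Geo::IP script from the quoted mail). It assumes the haversine formula with a mean Earth radius of 6371 km, and it assumes the table is read as "apply the MATCH score of the highest threshold reached, otherwise the NO MATCH score"; neither is confirmed by the quoted message. The Amsterdam receiver and its rounded coordinates are hypothetical; the other two coordinate pairs are taken from the quoted Geo::IP output.

```python
import math

# Assumption: mean Earth radius; the value used by the original script is not given.
EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

# (threshold in km, MATCH score), copied from the quoted @distance_score table.
DISTANCE_SCORE = [(1000, 0.50), (2000, 1.00), (4000, 1.50), (8000, 2.00), (16000, 2.50)]
NO_MATCH_SCORE = -0.50

def distance_score(distance_km):
    """Assumed semantics: score of the highest threshold reached, else NO MATCH."""
    score = NO_MATCH_SCORE
    for threshold, match_score in DISTANCE_SCORE:
        if distance_km >= threshold:
            score = match_score
    return score

SENDER = (37.3885, -122.0741)      # www.sourceforge.net (Mountain View), from the quoted output
RECEIVER_US = (39.1068, -94.5660)  # solutionroute.ca (Kansas City), from the quoted output
RECEIVER_EU = (52.37, 4.90)        # hypothetical receiver in Amsterdam, rounded coordinates

for name, receiver in (("US receiver", RECEIVER_US), ("EU receiver", RECEIVER_EU)):
    d = haversine_km(*SENDER, *receiver)
    print(f"{name}: {d:.0f} km, score {distance_score(d):+.2f}")
```

Under these assumptions the Kansas City receiver falls into the 2000 km band and the Amsterdam receiver into the 8000 km band, so the very same sender is scored differently depending on where the owner of the table happens to sit.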
Classifying on country is cheaper in CPU time, and lets you actually
target known sources.
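
As a rough illustration of the country-based alternative, a Python sketch could be a plain dictionary lookup keyed on the two-letter country code that a GeoIP lookup already returns. The score values, the default, and the function name `country_score` are made up for this example; only the country codes BR, US and CN come from the text above.

```python
# Hypothetical scores per ISO country code; tune to your own spam statistics.
COUNTRY_SCORE = {
    "BR": 1.5,
    "CN": 1.5,
    "US": 1.0,
}
DEFAULT_SCORE = 0.0  # assumption: countries not listed neither add nor subtract

def country_score(country_code):
    """Return the score for a sender's country code (case-insensitive)."""
    if not country_code:
        # The lookup found no country for this address.
        return DEFAULT_SCORE
    return COUNTRY_SCORE.get(country_code.upper(), DEFAULT_SCORE)

print(country_score("BR"))
print(country_score("NL"))
```

This is one hash lookup per message with no trigonometry, and the table stays the same wherever the receiving server is located.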
For example:
- get the sender's ISP name from whois
- get results for google image search "ISPname sysadmin"
- use face recognition
- measure the sysadmin's beard length
- more beard -> lower spam score
Just a random example to show that you can do really cool stuff with
statistics (use your imagination!) that has little practical use. Without
a doubt, computing real-life distance is a cool idea, but I think
the trade-off between relevance and efficiency is a bit off :)
--
Regards,
Tom
_______________________________________________
Dspam-user mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dspam-user
