On Tuesday 31 May 2016 at 15:47:56, Shivram Krishnan wrote:

> I am using SA as an oracle for Blacklisting. Our research concerns with
> combining multiple sources of blacklist and also consider the historical
> importance of an IP in a blacklist to create a very effective master
> blacklist.
>
> Let me give you an example.
> Suppose an IP address 1.2.3.4 appeared on Jan 1 ,2016 in Blacklist A.
> 1.2.3.4 stayed on Blacklist A for about 12 hours.
>
> We have developed a system which assigns a score to 1.2.3.4. If the score
> allocated to 1.2.3.4 is high we include it in our Master Blacklist.
>
> To evaluate the performance of the master Blacklist in terms of hitrate and
> false positives we plan to use SA.

How do you plan to find out when 1.2.3.4 appeared on the blacklist, and for
how long, if you are not dealing with live mail flowing through a real mail
server?

Dealing with email "after the event" (especially with regard to blacklists)
will give you very different results from dealing with it as it happens, if
for no other reason than that the spam which you and lots of other mail admins
receive is the very trigger which causes the IP address to go onto the
blacklist.

I think you should try to show in advance that your methods, and what you are
measuring, are a valid way of assessing spam from different addresses, in
order for the study to be useful.

Regards,

Antony.

-- 
If you were ploughing a field, which would you rather use - two strong oxen
or 1024 chickens?

 - Seymour Cray, pioneer of supercomputing

Please reply to the list; please *don't* CC me.